Ask Slashdot: Successful Software From Academia?

Defining success as outside of the university...? by ByOhTek · 2011-09-27 03:45 · Score: 2

That seems silly. When I worked in a bioinformatics group as an undergrad, we used a *LOT* of software that was only used inside of a university, partially because the kind of research it targeted wasn't necessarily popular in commercial areas yet, and partially because what we used was OSS and many commercial organizations preferred closed-source alternatives (sometimes for speed optimizations, sometimes for support reasons).

Maybe you should define your criteria as widespread use in the context of the target field, rather than outside of a university?

That being said, I think a lot of it does make it out, either directly or indirectly (through a third-party reimplementation).

--
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).

PostgreSQL? by 0racle · 2011-09-27 03:47 · Score: 3, Informative

That work for you?

PostgreSQL

--
"I use a Mac because I'm just better than you are."

kerberos by Anonymous Coward · 2011-09-27 03:47 · Score: 2, Interesting

kerberos, ganglia, folding?

"Widely used" isn't the norm by 93+Escort+Wagon · 2011-09-27 03:47 · Score: 3, Insightful

In this day and age, most good software developed in academia tends to get spun into a business venture that makes its academic developers very, very rich. See Google, for example.

--
#DeleteChrome

Possible example: by BlaKnail · 2011-09-27 03:47 · Score: 2

There was this company called Google that came out of some PhD students' work. I think it's still around and doing business.

Logical Reason for the Dearth by eldavojohn · 2011-09-27 03:49 · Score: 5, Insightful

The problem with software in academia is that it is often devoted to a sole purpose. It is not a generalized solution -- conversely -- it's often a demonstration of a solution so specific that it's never been done. Hence the awarding of a title to the creator. On top of that the teams are usually small and time is usually tight. It's also usually a side effect of the greater thing, the thesis. It will always take a backseat to the theory.

When software is widely adopted, it is because it has been widely supported and is a more generalized solution to a problem. If it uses hardware, it supports all kinds. If it reads or writes files, it covers all formats. This leads to widespread adoption but also takes a lot of time and a lot of contributions. If you're also working on your thesis, this is a daunting task to work on the side.

Nobody gets their PhD by making a predecessor's implementation support more file formats or hardware. So this is left to the licensing of the originator and the community -- who are often recognized as the real workhorses that go from prototype to actual usable software. That's why you don't find many PhD projects turned instant open source hit.

In bioinformatics, a relatively young field, most consumers of the software work in a lab and the input is fairly simple. But even with simple input they first had to agree on a format (those are just a few of what used to be many). BLAST and FASTA go back to the 1990s and 1980s respectively ... if it had depended on hardware or the constant change of text files like PDF and DOC, I think you can understand how hard it would be for academia -- let alone the originating researcher(s) -- to maintain and support it for the community. An open source effort could pick up that slack but then who deserves credit for that work?
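
To illustrate how simple such an agreed-on input can be, here is a minimal sketch of a FASTA reader in Python. It assumes only the common convention that a record starts with a '>' header line followed by one or more sequence lines. The function name `parse_fasta` and the sample records are hypothetical, made up for this example, and not taken from any particular tool.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record header (without '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record begins; remember its header text.
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            # Sequence lines of one record are simply joined together.
            records[header] += line
    return records


# Hypothetical two-record input, made up for illustration.
sample = """>seq1 example record
ACGT
ACGT
>seq2
GGCC
""".splitlines()

records = parse_fasta(sample)
```

Even this toy version has to make decisions (blank lines, data before the first header) that a widely used tool would need to document and support, which is the maintenance burden described above.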

--
My work here is dung.

Re:Logical Reason for the Dearth by RecoveredMarketroid · 2011-09-27 04:03 · Score: 3, Insightful

The problem with software in academia is that it is often devoted to a sole purpose. It is not a generalized solution -- conversely -- it's often a demonstration of a solution so specific that it's never been done.

Absolutely true. And much of the software is nearly unusable by anyone else-- it was built by the researchers to validate their own work, not to be used by others. If you've ever tried to use any code generated by grad students, it is often buggy, brittle, inflexible, indecipherable, etc... (I'm a late-stage PhD student, so I've run into this MANY times...) And that's the code that the researchers saw fit to release to the public-- imagine what the stuff that wasn't released looks like.

Re:Logical Reason for the Dearth by boristhespider · 2011-09-27 04:21 · Score: 2

Actually I probably should have mentioned that I think it's a further reason for the problems - the code is often patched together from inherited libraries and routines passed on by the PhD supervisor, who themselves inherited quite a lot of it from their own supervisor. Some code used in large academic projects honestly dates back to about 1970 or before, and hasn't been touched since other than to hack into double precision and hope that that doesn't break something subtle. Since some of these archaic routines are random number generators, it actually can break things quite badly.

There's no way I'd release my codes to the wild even if anyone else found them useful - their brittleness is a mixture of the way they were programmed, aiming directly at a very specific problem with no error trapping if the inputs went slightly outside an assumed region, and the routines that went into them. It would take significant work to clear out all the junk and reprogram it in an even vaguely modern language.

Even then, that language would be Fortran, which would put off quite a lot of developers. In my field, at least, academia is still stuck in Fortran and it's often F77. People are slowly shifting to a mixture of C++ and Python but it's taking a very long time.
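
The "no error trapping" kind of brittleness can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the poster's actual code: the functions `fragile_sin` and `checked_sin`, the approximation used, and the cutoff `MAX_ABS_X` are all hypothetical.

```python
import math

# Hypothetical limit of the region where the approximation is trusted.
MAX_ABS_X = 0.5


def fragile_sin(x: float) -> float:
    """Research-style routine: a truncated series for sin(x).

    It is only roughly accurate for small |x|, and it silently returns
    nonsense outside that region instead of reporting a problem.
    """
    return x - x**3 / 6.0


def checked_sin(x: float) -> float:
    """Same formula, but inputs outside the assumed region are rejected."""
    if abs(x) > MAX_ABS_X:
        raise ValueError(f"x={x} is outside the assumed region |x| <= {MAX_ABS_X}")
    return fragile_sin(x)


inside = checked_sin(0.1)    # within the assumed region
outside = fragile_sin(3.0)   # no error raised, but far from math.sin(3.0)
```

The unchecked version is what a single-purpose research code tends to look like: correct for the inputs its author used, and quietly wrong for anyone who strays outside them.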

Re:Logical Reason for the Dearth by boristhespider · 2011-09-27 06:40 · Score: 2

Haha, I'm in cosmology - one of the "big" CMB codes is CMBEasy, which for almost a decade was the only CMB code written in C++. The author originally "wrote" it by taking an F77 code (CMBFast), about half of which is inlined for speed and which is replete with common blocks with variables arbitrarily renamed for reuse in different routines (the original variable name obviously *also* being used in the routine) and running it through f2c, and then goggling at the result and trying to make sense of it while he rewrote and refactored the lot. It made him a hell of a good cosmologist, and quite a good programmer, too.

Re:Logical Reason for the Dearth by jellomizer · 2011-09-27 08:30 · Score: 2

You hit the nail on the head.
However they tend to suck in different ways.
Open Source: We did the fun coding portions and we got it to work. But we will leave out making a clean UI. It is open source, someone may come in and do it at some point... Oh, by the way, we will not bring any of your UI changes into our project because we got used to using the program the way it is.
Closed Source: It works... It looks good... Let's hope you never see how we got it there... And we will never touch that function again, it works as best as we can tell.
Academic: We solved this problem, it is good enough for me to get my PhD or keep my funding. I got what I needed, let's put it on the shelf; if someone wants it they can have it.
Government: Here are a bunch of discrete use cases to be coded as individual modules. The fact that 80% of them are redundant doesn't mean you shouldn't do them or break away from coding each use case as a separate module.
Private Industry: We got the program working, mostly. Feature X doesn't work. However only 10% of the customers are affected by this, and 4% will get a refund and 6% will complain but wait for the upgrade. But with the time saved we can get it to market and gain a 20% increase in sales over the life cycle of this version.

--
If something is so important that you feel the need to post it on the internet... It probably isn't that important.

X Windows, Ingres / Postgres by angel'o'sphere · 2011-09-27 03:50 · Score: 2

Subject says it, X was mainly developed at MIT. I guess Ingres and Postgres were originally also university projects.

--
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.

A few... by sl3xd · 2011-09-27 03:52 · Score: 5, Informative

* Kerberos (Widely used, part of Active Directory)
* X11
* AFS (Andrew File System)
* MACH (Used by GNU HURD and OS X)

And that's just a starting sample.

--
-- Sometimes you have to turn the lights off in order to see.

Re:How about... by Ruie · 2011-09-27 03:52 · Score: 3, Informative

And valgrind

Several by PiMuNu · 2011-09-27 03:54 · Score: 3, Informative

I think most of the finite element/multiphysics packages started as research projects, either in university or government labs (some military, some conventional), for studying e.g. electromagnet design or heat deposition by currents/EM radiation (e.g. Microwave Studio). Most of the radioactivation and nuclear shielding simulations used by the nuclear industry for designing radiation shielding are or were academic projects (e.g. MARS, FLUKA, MCNPX).

Re:How about... by staalmannen · 2011-09-27 03:56 · Score: 3, Informative

and LLVM

TeX by WillAdams · 2011-09-27 03:56 · Score: 3, Insightful

Subject of several theses:

http://www.tug.org/docs/liang/

http://www.pragma-ade.com/pdftex/thesis.pdf

https://www.tug.org/docs/plass/plass-thesis.pdf

(John Hobby's on METAPOST http://ect.bell-labs.com/who/hobby/thesis.pdf )

Probably others. More information at

http://www.tug.org/

and

http://www.latex-project.org/

and

http://wiki.contextgarden.net/Main_Page

William

--
Sphinx of black quartz, judge my vow.

Advanced Aircraft Analysis by CompMD · 2011-09-27 03:57 · Score: 3, Interesting

It started out as someone's graduate research project in the late 80s/early 90s, and today it is the #1 aircraft design software tool in the world. It's installed in universities, aircraft manufacturers, aerospace consulting firms, and government and military institutions across the planet.

Disclaimer: I worked on the software after it went commercial.

Few off the top of my head by jfp51 · 2011-09-27 03:59 · Score: 2

Rocks clusters (http://www.google.ca/search?gcx=w&ix=c1&sourceid=chrome&ie=UTF-8&q=rocks+clusters)
CHARMM (http://www.charmm.org/)
Gaussian as an example of how academic-inspired software should NOT be commercialised (http://www.gaussian.com/)

LLVM by Lally+Singh · 2011-09-27 04:00 · Score: 4, Informative

The backend for quite a few compilers, and a few shader compilers...

--
Care about electronic freedom? Consider donating to the EFF!

Students are short term by pavon · 2011-09-27 04:03 · Score: 2

Is there any list of successful software created entirely inside universities' labs that became widely used?

That is an odd restriction to make. Students are only at university for a short time. If their work during that time turns into something useful then they naturally continue it after they leave, either as an open source project or as a business venture. This is how it is meant to work, and there are tons of examples of such software.

MATLAB and Maple were both created at universities and later commercialized. Same for SPICE. On the open source side there is Apache, Sendmail, PostgreSQL, and the original implementations of nearly every RFC protocol on the internet.

Mosaic by Registered+Coward+v2 · 2011-09-27 04:03 · Score: 3, Insightful

From Univ of Illinois - it arguably changed the internet from a tool for techies to a new way to do business. One of the problems is if something is really good commercial companies may morph it into products that eclipse the original; but their contribution, when thought of as basic research, was invaluable. So the definition of success should not be limited to widely used, popular, or well known; but also include having defined a new industry or way of approaching a problem.

--
I'm a consultant - I convert gibberish into cash-flow.

LINPACK/LAPACK/Netlib by Arathon · 2011-09-27 04:04 · Score: 2

right up front: I know about this only because I work for these guys, but...

there's a whole host of Linear Algebra-related software written for high performance computing environments that is attributable largely to various teams of academics throughout the past 30 or so years. It is my understanding that these libraries get used by most anyone doing high-performance computing.

http://www.netlib.org/lapack/ http://en.wikipedia.org/wiki/LAPACK

Re:Linux by Kristian+T. · 2011-09-27 04:07 · Score: 4, Interesting

The title of Linus' thesis is: "Linux: a Portable Operating System" - so yes, it counts.

The real question is whether it is enough that a project can trace its roots back to academia - even if >90% was added later and/or by developers outside academia. I bet many products considered purely commercial started out in the back of the head of students during their studies. Many of those dropped out to build a company rather than stay and write a thesis about it. If you include those, and even consider some studying other majors than CS - you're probably looking at the bulk of all software in existence.

--
Run with the lemmings, and you'll get your feet wet.

Re:How about... by pnewhook · 2011-09-27 04:07 · Score: 2

And QNX http://www.qnx.com/

--
Tesla was a genius. Edison however was a overrated hack who liked to torture puppies.

Depends on who you ask... by PSandusky · 2011-09-27 04:07 · Score: 3, Interesting

Frequently the software doesn't start in a given academic lab, so much as it starts somewhere in a given research community and propagates to the academic labs as research needs dictate. ImageJ, for example, started at NIH, but now it's available to all and in use all over the place (including my lab).

Other software is developed cooperatively, and then academic contributions are added as they're needed to enable someone's research. If you run R (the statistical program) and start looking through all the extensions available in CRAN, you'll see tons of additions that have been generated in academic labs and released for use by the wider research community.

I work in biomechanics, and I've seen a few programs come out in that field through largely academic development. AnimatLab began (I think) at Georgia Tech, and I think Cofer et al. are still developing it within the university. OpenSim started at Stanford as an open source musculoskeletal simulation program, and is vastly preferable to the godawfully expensive SIMM, which does pretty much the same kinds of things. OpenSim is still alive and well at Stanford, although the developer network spans multiple institutions, academic and otherwise.

Much as I might wish that I could spend more of my time developing programs and playing with software within the academic sandbox, more often it's simply more practical to cast the nets for software from someone, somewhere doing somehow similar research, and then using the software you find if it's useful to your work, rather than reinventing the wheel in favor of advancing academic software development.

--
"What's the use in being grown up if you can't be childish sometimes?" --Fourth Doctor, "Robot"

rsync by Short+Circuit · 2011-09-27 04:08 · Score: 2

IIRC, rsync was the culmination of its original author's thesis.

--
tasks(723) drafts(105) languages(484) examples(29106)

Ho ho ho by anom · 2011-09-27 04:10 · Score: 3, Informative

FWIW, I'm a PhD student at a reasonably large institution in the US.

Very little of this stuff sees the light of day. The vast majority of software is written simply as a proof of concept for some particular method/system/algorithm in order to get published. Good conferences/journals will typically want not only a well thought out idea, but an idea that you can implement, have implemented to some extent, and have shown to work. That having been said, most of what gets produced is complete and total garbage -- typically just enough code to be able to prove that something runs correctly and in a given amount of time.

Personally, I have written a bunch of junk code during my time here. I'd like to think I know more or less how to write good code after all these years, but writing good, well documented, well tested code takes time we don't have -- writing code is simply a means to an end (publication) -- and so most of the code I write is hasty and ugly. This even applies to code that people say is for "wide distribution".

Before you go hounding on academia however, I'd warn that writing "good code" isn't really the point of what we're doing -- the point is to produce a reasonable method of solving some particular problem or type of problem. Going into bioinformatics for example, there are a whole bunch of problems that involve performing more efficient analysis of certain types of graphs. If a researcher discovers something along these lines, he/she will likely write some junk code to prove that the bare algorithm works, perform some analysis of it, publish it and move on. This may or may not end up actually being a useful improvement -- if it is however, then some implementer whose actual job it is to code whatever medical software might be using this algorithm then has a basic blueprint of how to proceed.

As for some examples of software from academia that have made it out, let me think...

Coverity - static code analysis tool, started at Stanford then moved into being a startup and is now quite successful
PostgreSQL - Originally from Berkeley
Bro (Intrusion Detection System) -- written by a researcher from Berkeley/ICSI -- is still somewhat "in academia", but I have heard of several production deployments

That's all I feel like coming up with right now, but I think the general pattern here is that if/when some piece of software produced in academia is seen to have value in its own right (e.g., away from the original research/publication that spawned it), it typically gets spun off in a start-up or a more concerted effort is given to its development, at which point one can actually spend the time to write good code.

Re:latex ? by tautog · 2011-09-27 04:25 · Score: 2

Blackboard.

*shudders*

Someone tell me their thesis was rejected...

spice by Fnord666 · 2011-09-27 04:25 · Score: 2

SPICE is a general-purpose circuit simulation program for nonlinear dc, nonlinear transient, and linear ac analyses. Circuits may contain resistors, capacitors, inductors, mutual inductors, independent voltage and current sources, four types of dependent sources, lossless and lossy transmission lines (two separate implementations), switches, uniform distributed RC lines, and the five most common semiconductor devices: diodes, BJTs, JFETs, MESFETs, and MOSFETs. SPICE originates from the EECS Department of the University of California at Berkeley.

--
'The tyrant will always find pretext for his tyranny.' - Aesop's Fables

SAS and R by kj_kabaje · 2011-09-27 04:27 · Score: 2

Both SAS and R were originally developed inside academic environments. I'd say they both enjoy a rather wide audience (one FOSS, the other rather on the expensive side).

BIND DNS by egamma · 2011-09-27 04:27 · Score: 4, Informative

I can't believe nobody's said this yet...

BIND

BIND was written by Douglas Terry, Mark Painter, David Riggle and Songnian Zhou in the early 1980s at the University of California, Berkeley as a result of a DARPA grant. Versions of BIND through 4.8.3 were maintained by the Computer Systems Research Group (CSRG) at UC Berkeley.

--
Battlemaster--Game with friends in medival realms

One or Two by UrbanaMan · 2011-09-27 04:46 · Score: 2

the University at Champaign-Urbana lays claim to one or two projects that have some popularity ..

the Mosaic browser and its offshoots Netscape, Internet Explorer and Oracle Screens began there.

Javascript (as part of Netscape??)

Apache web server

Project Gutenberg

and, if 'travelling' across the universe fictionally counts as 'widely used outside of the university' then there is HAL in 2001, that (who?) claims to have been activated at the Urbana campus.

Re:How about... by hazydave · 2011-09-27 04:54 · Score: 2

And Mach (kernel developed at CMU, used in NeXT and MacOS).

--
-Dave Haynie

Spice bred a large family tree by kral · 2011-09-27 04:56 · Score: 2

There were two very different versions of SPICE. SPICE2 was a Fortran program, and is the basis for the PC version PSPICE (Microsim>OrCAD>Cadence) and the minicomputer version HSPICE, though many newer simulators are based on the code for SPICE3, rewritten in C by a subsequent Berkeley effort. Its legacy in electronics engineering is such that even independently developed simulators (Eldo, Spectre) rely on the conventions and methods from SPICE, while incorporating incremental improvements (a new algorithm here or there) and being distinguishable mainly by how they differ from SPICE.

--
whatever is - the music is

moodle by ezh · 2011-09-27 05:45 · Score: 3, Interesting

http://moodle.org/

Re:moodle by SteveFoerster · 2011-09-27 06:44 · Score: 2

This. For those who aren't familiar with it, Moodle is a learning management system that was started by Martin Dougiamas as part of his PhD research into how open source software could support a particular type of instructional design. It's become the main open source alternative to commercial behemoths like Blackboard, and a number of prominent universities have adopted it.

--
Space game using normal deck of cards: http://BattleCards.org

Bioinformatics by Vornzog · 2011-09-27 06:13 · Score: 3, Interesting

The 'problem' with bioinformatics is that the field is extremely broad. Unless you write BLAST or one of the big sequence assemblers, your software is only going to appeal to a tiny fragment of an already small bioinformatics community.

I wrote software as part of my Ph.D. that is now distributed world wide. I guarantee you've never heard of it - it sets the standard for how to do certain types of phylogenetic analysis, but almost no one does that analysis.

During my time as a postdoc, I wrote a very simple curve fitting routine and put a minimal GUI on top of it. I am now getting requests from multiple countries to modify it to read in files from their instrumentation. Once again, only the tiniest handful of people care, but for those people, this is revolutionary stuff.

The question here is, how do you define success? Like a lot of the responses to this thread, I wrote a small script here or there to solve my own problem. Turns out, it solved a problem for someone else, too. My best known piece of software was a hack, a one-off script, written in an afternoon, that I got yelled at for even bothering to spend time on, and was only ever intended for my own use. It turned out to be the lynchpin for our project, got published in a peer reviewed journal, and has since gone global. I found out later that one of my undergrad computer science profs had solved the same problem 20 years before I did, in a more elegant way, and published it in a good, but non-science, journal - no one has ever heard of it.

Neither of us had the expectation that our software would amount to much. I would define the prof's work as 'successful' - he published a paper on an interesting academic topic. I would define my software as 'wildly successful' - I got an unexpected publication and a global (if small) user base, along with a reputation for fixing problems that would later get me a good postdoc position.

This isn't really an academia question. The most common advice in the open source community is 'scratch an itch'. Write something to fix a problem you see. If you write good stuff, maybe your code will become 'successful'. Or, maybe your afternoon worth of hacking will just turn into an afternoon worth of experience you can apply to the next problem.

--

-V-

Who can decide a priori? Nobody.
-Sartre

Re:Molecular dynamics by snoop.daub · 2011-09-27 07:09 · Score: 2

Also LAMMPS and DLPOLY, but they are a bit more niche. The ones you mention are used a lot in big pharma these days, for example.

Staying on the chemistry/chemical physics front, quantum chemistry codes like Gaussian all came from academia.

Re:Under sufficiently large definitions of "widely by Weezul · 2011-09-27 07:16 · Score: 3, Interesting

Isn't the first one that comes to mind the world wide web? CERN is definitely academia. I'd imagine many other protocols originate in academia. Any idea about SMTP, Usenet, etc.?

BSD, X11, Mach, PostgreSQL, and SSH were all explicitly academic projects.

There is also a question about what qualifies as academia beyond simply universities and government labs. Linus Torvalds started Linux while a university student but later landed in industry. Bjarne Stroustrup worked at AT&T Research when he started C++ but he landed at Texas A&M later on.

Virtually all programming languages originate in or near academia : Lisp was MIT. Python was started at CWI. Haskell. OCaml. etc. Among the non-academic languages most originate within huge organizations whose research departments start to resemble academia : Smalltalk was PARC. Fortran and COBOL were IBM. C was AT&T. Erlang was Sony. etc. Java and Perl were seemingly further from academia, but academia's influences upon them abound.

Afaik, all computational libraries used for serious numerical programming, like stock trading, computational fluid dynamics, etc., were developed in academia.

--
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell

Re:Linux by TemporalBeing · 2011-09-27 07:27 · Score: 2

It started out that way - but by the time Linus graduated in 1997, linux had become a huge thing, and I bet that if he hadn't made it the topic of his masters - he wouldn't have finished at all.

It started out as a method for Linus to access his work on the school Minix computers (source: Just for Fun). He later did use it as part of a Masters project for doing multi-architecture Operating systems, but that's it. It is mostly a development as a personal (prior to 1995, part-time/full-time without pay while he pursued academic degrees) and commercial (since 1995 when he's been paid to work full-time on it) project.

--
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)

Re:Under sufficiently large definitions of "widely by Friggo · 2011-09-27 08:20 · Score: 2

Just to nitpick a bit, Erlang was developed by Ericsson, not Sony.

A few (or perhaps, more than a few) by Arrogant-Bastard · 2011-09-27 09:38 · Score: 3, Informative

Andrew File System - CMU
archie -- Princeton?
CAP (appletalk for Unix) -- Columbia
cops/tripwire -- Purdue
GNU everything -- MIT
Gopher -- Minnesota
Kerberos -- MIT
Khoros -- New Mexico
Mach -- CMU
NNTP -- UC San Diego
Mosaic -- Illinois
sendmail -- UC Berkeley
BSD -- UC Berkeley
RCS -- Purdue
Usenet -- Duke/UNC
tcl/tk -- UC Berkeley
multi-CPU Unix -- Purdue
cu-seeme -- Cornell

I'm sure I'm forgetting quite a few. And of course not all of these are STILL successful, but in their day they made their mark, and often paved the way for other projects.

Yes & no by Weezul · 2011-09-27 14:59 · Score: 3, Informative

As I note upthread, virtually all important programming languages originated in academic-like environments, even if they are officially corporate.

There are I think two revolutionary non-academic programming languages :

- Smalltalk was developed by Xerox PARC, but ultimately created object oriented programming, which certainly used academia to gain traction.

- C was developed by AT&T, but completely revolutionized our world. It's almost surely the most important language ever written. There had been structured languages before. I think Fortran and COBOL were developed by IBM. And academia had all its research and teaching languages. Yet, it was C that brought structured programming and type-safety to system level programming, previously dominated by assembler. Imho, const is pure genius. C could not help but succeed with or without academia, but AT&T was still a fairly academic environment at that time.

In other words, your classification of generalized academic project doesn't include either afaik, but clearly both can fall under some generalized academia. You could not design C, and maybe Smalltalk too, without thinking deeply about languages from a hybrid academic and industrial perspective. If you pursue a blind industry perspective, you create garbage like PHP or VB.

--
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
