Many very important research articles are now unavailable to the public, who pay the salaries and associated costs for researchers and their many colleagues (down to the level of undergrads and summer research students) to:
read the literature and stay current with the new knowledge accruing and the opportunity gaps
write the grants
design the experiments
order the required equipment/reagents
perform the experiments
reduce/analyze the results
iterate back to re-do or extend some of the experiments
summarize the analysis in the context of the current literature
prepare & submit the manuscripts
organize incoming manuscripts during the editorial process
solicit reviews
do the reviewing
pay the page fees
pay the institutional overhead fees to cover the subscription costs.
The public, which largely has no access to the product, is essentially supplying - via the process described above - the capital that powers the current publisher business models. Those models largely consist, as others have said here, of providing services (online publishing frameworks; bureaucratic support for the editorial staff; distribution networks to get the final product to the customers) that nowadays could nearly be performed by a few sharp, motivated college students in their spare time, if the final distributed product were electronic only (e.g., if creating hard copies of articles were left to the purchaser/reader).
Historically, what gave rise to this current situation?
Coincident with the accelerating government investment in science and engineering during and after WWII, the number of scientific, technical, and medical (STM) manuscripts began to grow way beyond what the typical not-for-profit professional society publisher could contend with. During the 1950s & 1960s, as the printing industry began to digitize, these same society publishers were strapped to cope not only with the editorial burden, but also with the technical burden of linking in to increasingly digital print workflows.
Larger, for-profit publishing concerns could leverage economies of scale to handle both the increase in manuscripts and the need for advanced IT practices in that last step in the process (sending electronic proofs to the printer to create hard copy for distribution). It was at this point - taking advantage of a market need - that many of the publishers now charging these large subscription fees inherited the responsibility for documenting and disseminating the collective knowledge of modern civilization, and locked in some significant profits (as subscription prices have risen over the last 50 years) based on the value of this service to society.
So - as many have pointed out - the constraints and requirements that brought the publishers into the business have gradually melted away, due both to the rise of ubiquitous, richly functional digital word processing & desktop publishing tools and to the growth of the web in the mid-to-late 1990s as a ubiquitous and user-friendly platform for publishing information. Most of the former advantages the large publishers had over smaller publishers (and individuals) in the layout, publishing, and dissemination of research articles have disappeared.
As many others have said on this thread, it's a given now that the raw published articles (in HTML, PDF, OOXML, etc.) have become very low-value commodities. It doesn't make economic sense that such commodities, which the former slave labor (researchers/reviewers/editors) can fairly easily produce without any assistance from the publishers, should be able to support large, locked-in revenue streams.
Given that is the case, the question that remains is:
What will the NEW value-added services be that can enable the publishers to continue to make a living into the future?
This is not a new question.
Superwiz is most definitely correct to point out there is gold in them thar data.
However, it's not strictly true that Open Access to journal articles "misses the real problem", nor is it true that NIH and other organizations are not moving on this issue of Open Access to data.
1) The NIH has a Data Sharing and Access Policy which strives to get such data out there where all can reap the full benefits of mining it.
http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
2) NIH is also committed to funding both repositories and application of algorithmic tools for mining such data (e.g., all of the resources hosted by NCBI such as the Entrez data sets and tools). For some of the more complex data types that are being generated, NIH is funding grants and contracts to help make this data more available.
3) The Science Commons (associated with the Creative Commons) has as one of its primary objectives to create and persistently host a richly expressed repository of public research data (primary data and derived data) specifically to catalyze discovery by the broader community.
http://sciencecommons.org/
This is just the tip of the iceberg. The recognition is there. Some significant technical obstacles still need to be addressed. But I do think the desire SuperWiz expresses here will gradually become a given over the next decade.
I would also add that prior to the 1990s, almost no research lab made much effort to get its data (raw & derived) out into the "commons". Most didn't think of it as valuable, and there is some truth to the thought that such a deluge would slow, as opposed to hasten, scientific discovery.
I believe this view is changing, and the expectation that data needs to be published will be a given within a decade.
PS: Here's a link to the current NIH Open Access Policy:
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-05-022.html
Peter Suber, who maintains the Scholarly Publishing and Academic Resources Coalition (SPARC) Open Access newsletter and the Free Online Scholarship (FOS) newsletter, has been following this story for years.
You can find a lot of contextual detail relevant to the discussion by starting with the 11/2/2007 copy of this newsletter.
http://www.earlham.edu/~peters/fos/newsletter/11-02-07.htm
By far the most elaborate & amazing Rube Goldberg apparatus ever filmed is "Der Lauf der Dinge" (The Way Things Go) by Swiss artists Peter Fischli and David Weiss (http://www.frif.com/cat97/t-z/the_way_.html).
It includes not only complex mechanical agglomerations but all sorts of homemade pyrotechnical concoctions. These guys really knew their inorganic chemistry.
Fischli & Weiss filled a warehouse with dozens & dozens of these devices linked in series, with the output of one element triggering the next one in the pipeline. The camera just keeps walking down the line following the action. You get the feeling the devices are set up in a large circle inside a huge empty building, with the camera in the middle slowly turning to follow the train of activity.
The audio is quite intense, as well. Each device has its own very distinctive sound, which helps to make the video quite animated.
Most incredible of all, they appear to have done it in a minimum of takes. There seem to be only 4 or 5 cuts in this 45-minute video, and some of them are subtle enough to require repeated viewing to pick out.
Despite the fact that the primary actors consist of auto tires, ladders, plywood sheets and soda bottles, the film is remarkably fun to watch. I highly recommend it.
You can pick it up on DVD or VHS at many spots on the net. Here's a link to DVDPriceSearch.com's comparative price listing:
http://www.dvdpricesearch.com/cgi-bin/dvdcalc2?cmd=calc&tmpCart=15602
Hear, hear!
Good point!!??
Still - if the goal is to have a time-saving device for managing multiple web servers *and* one capable of leveraging his/her programming experience, Comanche is the way to go and may be *the* reason to become Linux-competent.
He/she didn't explicitly ask for a tool capable of leveraging his/her programming acumen, I know. But the fact that he/she was tempted to "write it" himself/herself implies to me there is some programming talent on tap.
I've found that the store-bought solutions have a relatively steep learning curve - despite the marketing hype - and are generally too inflexible to let you easily customize the interface to meet the specific needs at hand.
This is a niche - shallow learning curve with maximal flexibility/expandability - that Comanche fills quite well.
Just my $0.02
I've started using Comanche and found it to be very powerful.
You can check it out at:
http://www.covalent.net/projects/comanche/
It's open source, XML-based and looks to be pretty easily expanded, if you have some Tcl/Tk, Perl/Tk or Java experience.
I heartily endorse Quebec's comment on the Mandrake 7.2 distribution.
;-)
This reviewer does appear to have based his evaluation of Linux installation on the *previous* generation of distros. Either that, or he's not familiar with Mandrake's biz model.
Let me state from the outset, I am *not* a Mandrake employee, nor have I taken any remuneration for my support of their Linux distro. I'm just a very, *very* happy user.
I should also state at the outset that I love the Mac!
However...
I expect much of my software & OS frustration with the Mac to finally go away, now that they've shifted to a Unix core.
Having said that, the smart & deft folks at Mandrake have done exactly what the Open Source model encourages - carved out a market niche by meeting several demonstrable user needs.
These are - as of the 7.2 distro:
1) Automatic hardware identification: HardDrake is the sort of GUI-oriented, auto-identification hardware config utility Linux has been in need of for a long time. Admittedly, the 'open' hardware spec that comes with the x86 architecture makes this task *much* harder than with a closed spec like the Mac. Still, Mandrake has done a bang-up job carving out a very specific need and providing an attractive solution for it. The graphical part *is* important - even for us command-line chauvinist weenies. The human brain has a distinctly visual bias. It does help to put a pretty face on this complex task. HardDrake will remain a work in progress - as new hardware is constantly coming on line - but I expect it will evolve for the better, as most of Mandrake's tools consistently have done.
2) Simple as pie, graphically-oriented, *smart* installer. By smart, I mean it is able to harvest the value of the hardware auto-identification from point 1 above. There are more steps than with the typical MacOS installation, but not by much. Plus, the user doesn't need to bring an intimate knowledge of their hardware into play, as was the case with the previous generation of Linux distros.
3) Very simple printer setup! We have 2 Macs and 3 PCs on our home LAN. We also have an Epson inkjet & a Brother laser printer on the LAN via an Axis network print server. It took me less than 1 minute per printer to configure my Mandrake 7.2 Linux to print to these printers. I still can't quite get the Macs to do it, after *many* hours of trying to hunt down & configure the necessary network printer drivers. The Epson happened to have the required AppleTalk printer driver. The Brother printer can still be accessed by only 1 of the Macs, via the 3rd-party USB PowerPrint driver. There are ways of printing to the Brother laser printer from a Mac via EtherTalk, but they cost a lot of extra $$$ and/or require the Mac OS print stream to be first converted to PostScript. These PS conversions not only require a lot of extra CPU cycles, but the output also needs more printer memory & tends to be lower quality, unless you spend a lot of time configuring the PS driver.
4) With their i586 compile, they meet the need of many Linux users to tweak every last computational optimization out of the newer CPU cores. Maybe we'll even get i686 & Athlon specific compiles soon, too!
5) Mandrake also seems to have done a good job offering kernel & app development management services via its 'Cooker' project. Not only does this provide a service to the user community, it also helps the folks at Mandrake roll out new Linux software to that community. This is made most dramatic by their announcement today of the MandrakeFreq 'frequent release' service:
http://www.mandrakesoft.com/community/mandrakesoftnews/latest
Mandrake is awesome!
All other distros are merely great.