Dealing w/ Copying of Online Articles via Open Proxies?

Posted by Cliff on Wednesday December 11, 2002 @12:16PM from the preventing-unauthorized-distribution dept.

Creosote asks: "Concerns about piracy are no longer just for the big commercial media outfits. JSTOR, one of the major repositories and distributors of online versions of scholarly journals, has been hit by crackers taking advantage of open proxy servers to download about 51,000 articles from 11 JSTOR journals. Even nonprofit academic publishers rely on income from publications to exist, so the spectre of large-scale unauthorized copying is legitimately scary to them. In a letter to librarians and publishers, the president of JSTOR notes that while the "threat of open proxies has been recognized for some time in the web community...it does not appear that network administrators, librarians, or content providers are aware that organized efforts are being employed to gain unauthorized access to restricted campus resources" through them. I work for a nonprofit publisher (a university press) that will soon be making peer-reviewed digital projects available online, and they can't all be given away for free, so this hits close to home. Are there better solutions than turning into an attack dog, ala the RIAA and the MPAA?"

34 comments

what is an open proxy by 2040x · 2002-12-11 12:50 · Score: 2

what is an open proxy?
1. Re:what is an open proxy by Anonymous Coward · 2002-12-11 17:06 · Score: 0
  
  A proxy which is not closed.
2. Re:what is an open proxy by .milfox · 2002-12-11 17:32 · Score: 2
  
  A synopsis of why open proxies are 'bad'...
  
  Access restrictions on these materials are commonly done on a domain-limited basis. Many university libraries have a proxy server so that folks who attend the university (authorized users) but aren't on campus can access these materials. There usually is a rather open or trust-based process to access these proxies, and as such they pretty much grant global access to the materials.
  
  Got it?
Information wants to be free... by xagon7 · 2002-12-11 12:51 · Score: 1

At least that's what I hear anyway. ;)
Perhaps this is the first chink . . . by Anonymous Coward · 2002-12-11 12:57 · Score: 3, Insightful

. . . in the armor of the academic peer review "cartel." Of what relevance is an organization like JSTOR in an age when anyone can publish and peer review could be done electronically? The idea of locking up scholarly papers and charging fees seems perverse to me, anyway--they've already been paid for once by taxpayers or donors to non-profits.
~~~
1. Re:Perhaps this is the first chink . . . by reallocate · 2002-12-11 14:30 · Score: 5, Interesting
  
  How do you know that someone has already "paid for" the papers? Seems to me that charging a fee for a paper is a good way of acquiring revenue to keep the operation going.
  
  And, yes, anyone with access to a web server can publish, but I certainly wouldn't want my papers "peer" reviewed by an amorphous mob of unknown readers. Consider the puerile banalities that pass for comments here on Slashdot.
  
  --
  -- Slashdot: When Public Access TV Says "No"
2. Re:Perhaps this is the first chink . . . by vegetablespork · 2002-12-11 14:59 · Score: 1
  
  For the first, academic institutions are generally non-profits and/or government institutions. Thus, they're taxpayer and/or donor supported. Therefore, the research has been funded by taxpayers and/or donors. (If the research can support itself, then perhaps the schools should sell the rights to it and save their money for teaching and service.)
  With regard to the second, we're not talking about /. moderation here (Heaven forbid). Peer review could still be accomplished with a reputation-based digital signature scheme. Heck, a T:n threshold scheme could be used among several distinguished scholars, and the votes could be blinded. There's no need for these money-grubbing middlemen anymore.
  ~~~
  
  --
  Call (206) 338-5780 COLLECT for information about a genuine BA, BS, MA, MS, MBA, or Ph.D.
3. Re:Perhaps this is the first chink . . . by sethstorm · 2002-12-11 16:55 · Score: 0
  
  Sure, but if you consider the "amorphous mob of unknown readers" anyone in here, anyone outside the Ivy League schools or anyone who resides outside a privately guarded community, then all you are going to get is rehashed versions of the same opinion. If you want elitism in this kind of stuff, fine - just go back and talk to your "closed community" in New England; return only when you've had enough of the same opinions.
  
  --
  Twitter supports and protects racists - by smearing their critics with the "Hate Speech" label.
4. Re:Perhaps this is the first chink . . . by Dahan · 2002-12-11 17:27 · Score: 2
  
  Well, maybe nobody has been paid for the papers... but in any case, why should the publishing company get all the money? The scholars who write the papers don't get any monetary compensation from the publishing companies--the publishers get all these papers for free. They should only charge what it costs to print/bind/distribute.
  As for peer review, the same people doing reviews of hardcopy papers can review electronic papers... I don't see why that has to change. The reviewers aren't getting paid either...
  BTW, arXiv has a good selection of physics, math, and some CS papers.
5. Re:Perhaps this is the first chink . . . by reallocate · 2002-12-12 01:34 · Score: 2
  
  It isn't a wish for elitism, it's a recognition that, in this case, "peer" does not mean the next person who happens to walk in the room. Peer review of a scientific or academic paper means review that is limited to other scientists and academics with expertise and experience in the same discipline. Whether they review hardcopy or an online version is simply a matter of delivery. (Personally, I'd rather read anything longer than a few paragraphs in hardcopy, not on a monitor screen.)
  
  What I really take issue with is the underlying notion that all information, every created work and every published word, is somehow instantaneously free for the picking to the entire world. That's stretching the notion of open source and free software -- software development models -- into the realms of politics and philosophy. It's a ludicrous stretch.
  
  --
  -- Slashdot: When Public Access TV Says "No"
What?!? by MacAndrew · 2002-12-11 13:00 · Score: 2

Piracy is wrong? (angelic expression)

Er, copyright infringement, because piracy has such a "dirty" sound to it.

And copyright infringement is either a triviality or a birthright, so the arguments here go.

*

More seriously, I sympathize. I guess the honor system is out?

Ideally, even peer-reviewed work (or, I would hope, especially peer-reviewed work, because it has significant added value) would be in the public domain anyway. The single best approach would be to acquire grant or public funding as a one-time purchase of the data.

After honor system and public domain come the tedious closed-access or copyright-suit methods, which are vulnerable to hacking and piracy, respectively. In case there are further alternatives, I'll be lurking here to hear them.
Check all allowed IPs from open proxies by joebp · 2002-12-11 13:00 · Score: 4, Insightful

From the page: We're sorry. You do not have access to JSTOR from your current location.
It seems they have some whitelist of allowed IPs. Why not just traverse this once every so often and look for open proxies?

Slash said: You can't post to this page.
Another retarded open proxy problem :-(
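
A rough sketch of the periodic sweep suggested above, in Python. It only tests the first hop: it asks each whitelisted address to fetch a page that only you serve and looks for a canary string in the reply. The port list, the canary URL on `probe.example.org`, and every function name here are assumptions made for illustration, not anything JSTOR is known to run.

```python
import socket
from urllib.parse import urlsplit

# Assumption: ports where HTTP proxies are commonly found.
COMMON_PROXY_PORTS = (80, 3128, 8080)


def build_probe_request(target_url: str) -> bytes:
    """Build a proxy-style request: the request line carries an absolute URL."""
    host = urlsplit(target_url).netloc
    return (
        f"GET {target_url} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def response_was_relayed(raw: bytes, marker: bytes) -> bool:
    """True if the reply is a 200 and contains the canary we expect."""
    status_line, _, rest = raw.partition(b"\r\n")
    parts = status_line.split()
    return len(parts) >= 2 and parts[1] == b"200" and marker in rest


def probe(ip: str, port: int, target_url: str, marker: bytes,
          timeout: float = 5.0) -> bool:
    """Ask ip:port to fetch target_url for us; any socket error counts as 'closed'."""
    try:
        with socket.create_connection((ip, port), timeout=timeout) as conn:
            conn.sendall(build_probe_request(target_url))
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                chunks.append(data)
    except OSError:
        return False
    return response_was_relayed(b"".join(chunks), marker)


def find_open_proxies(whitelist, target_url: str, marker: bytes):
    """Return (ip, port) pairs on the whitelist that relayed the probe."""
    return [
        (ip, port)
        for ip in whitelist
        for port in COMMON_PROXY_PORTS
        if probe(ip, port, target_url, marker)
    ]
```

A sweep would look something like `find_open_proxies(["192.0.2.10"], "http://probe.example.org/canary.txt", b"canary-123")`. Only scan addresses you are authorized to test, and note that a clean result says nothing about proxies hidden further down a chain.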
1. Re:Check all allowed IPs from open proxies by lifeless · 2002-12-11 13:33 · Score: 1
  
  > From the page: We're sorry. You do not have
  > access to JSTOR from your current location.
  >
  > It seems they have some whitelist of allowed IPs.
  > Why not just traverse this once every so often and
  > look for open proxies?
  
  Because this isn't sufficient. There may be two or more hops in an open proxy chain. The authorised IP may be locked down tight, and only allow internal IPs to access it - so it would pass the routine check. But *any* of those internal IPs could also run a proxy server. And that proxy may not be locked down.
2. Re:Check all allowed IPs from open proxies by aminorex · 2002-12-11 14:18 · Score: 5, Informative
  
  Open proxies are crucial to the survival of political
  freedom.
  It's just a wrong-headed approach to access
  control, filtering by IP. The correct approach to
  access control is to require a controlled token
  to connect. An IP address is not a controlled
  token, and using it as one, as JSTOR does, is
  incompetent web service design.
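
One way to picture a "controlled token" is an expiring, signed string that the content server issues to an authenticated user and checks on every request. The Python sketch below uses an HMAC over a user id and expiry time; the token layout, the `SECRET` value, and the function names are all illustrative assumptions, not a description of any deployed system.

```python
import hashlib
import hmac
import time

# Assumption: a server-side secret that is never sent to clients.
SECRET = b"replace-with-a-real-secret"


def _sign(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user_id: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Return 'user_id:expires_at:signature' for an authenticated user."""
    payload = f"{user_id}:{expires_at}"
    return f"{payload}:{_sign(payload, secret)}"


def verify_token(token: str, now=None, secret: bytes = SECRET) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    try:
        user_id, expires_text, signature = token.rsplit(":", 2)
        expires_at = int(expires_text)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{user_id}:{expires_text}", secret)
    # Constant-time comparison, on bytes so odd input cannot raise.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return False
    current = time.time() if now is None else now
    return current < expires_at
```

Unlike a source IP address, such a token travels with the user through any number of proxies, and it can be revoked or left to expire.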
  
  --
  -I like my women like I like my tea: green-
3. Re:Check all allowed IPs from open proxies by raju1kabir · 2002-12-12 11:57 · Score: 2
  
  The correct approach to access control is to require a controlled token to connect. An IP address is not a controlled token, and using it as one, as JSTOR does, is incompetent web service design.
  
  Easy to say, but hard to implement.
  
  When you talk about university campuses, you've got tens of thousands of authorized users that may or may not be in some centralized database. You expect JSTOR to go to each campus, set up a card table outside the cafeteria, and assign IDs to the students, faculty, staff, and other assorted parties covered by their contract?
  
  Or, you expect the universities to all create some uniform authentication database for JSTOR to query against?
  
  I doubt either one is going to happen (though the second, perhaps as a contract stipulation, seems slightly more likely).
  
  --
  "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
It's going to be painful by Anonymous Coward · 2002-12-11 13:04 · Score: 1

If you can't risk that your data is copied, don't publish it, at least not digitally. Others will see an opportunity where you see a threat. We'll have to wait and see who is right in the end. "Information wants to be free" may sound like a naive romantic vision, but there is some truth to it. Think about how much free information your whole life is based on and how many people worked to create that information. Would things really work if information did not have a tendency to break free of restrictions?
Probably no intention to resell by Futurepower(R) · 2002-12-11 13:04 · Score: 4, Insightful

The people who stole the articles probably have no intention to resell them. Probably, they were just doing it because they could. The articles will sit on some hard drive somewhere, and eventually be deleted.

It would be impossible to resell the articles without revealing who stole them. Also, would you want an article from an unknown source, that could have changed it?
1. Re:Probably no intention to resell by zonker · 2002-12-11 19:36 · Score: 0
  
  <jehoovermode=on>you did it didn't you? i know you did because you seem to know too much about this.</jehoovermode>
  
  --
  Large print giveth, and the small print taketh away
2. Re:Probably no intention to resell by ameoba · 2002-12-12 00:23 · Score: 2
  
  Yeah, it'd be hard to make a profit reselling. Even if you could sell burned CD/DVDs of "all research papers on gecko population dynamics from 1970-1998" to grad students, people, knowing they've got pirated data, are going to turn around & continue to pirate it. Considering the ease & low cost of duplicating large amounts of data, the risk of distribution is going to greatly outweigh any potential rewards.
  
  Besides, if the 'hacking' is as simple as using an open proxy to mirror the site, the perp is most likely a script kiddie that can't understand any of the articles anyways.
  
  --
  my sig's at the bottom of the page.
Obviously people are not ready for technology by ObviousGuy · 2002-12-11 13:08 · Score: 1

Anyone with content on the web has three choices when it comes to people stealing their copyrighted materials.

1) Say nothing and absorb the losses
2) Become aforementioned "attack dog"
3) Take the materials down

Pirates would like nothing more than for content providers to do choice 1. However, that's an unlikely scenario to last for a long time and eventually the content provider will have to resort to either attacking pirates (ala RIAA) or simply take their ball and go home.

At least with #2 the stuff stays online and is accessible to legitimate users of the material. No one wins if the material goes offline.

--
I have been pwned because my /. password was too easy to guess.
Just like any other security issue: by Alethes · 2002-12-11 13:22 · Score: 2

Don't have a single point of failure. Whitelisting IPs for access is great, but just like any other method of authentication, it has its weaknesses and should be used in conjunction with any number of other authentication mechanisms.
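
A small Python sketch of that layering: the request passes only if the source address is on the whitelist and a second, independent credential check also succeeds. The networks shown are documentation ranges, and `authorize` and its parameters are hypothetical names chosen for the example.

```python
import ipaddress

# Assumption: subscriber networks, written here as documentation ranges.
SUBSCRIBER_NETWORKS = ("192.0.2.0/24", "2001:db8::/32")


def ip_allowed(client_ip: str, networks=SUBSCRIBER_NETWORKS) -> bool:
    """True if client_ip falls inside any whitelisted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in networks)


def authorize(client_ip: str, credentials_ok: bool,
              networks=SUBSCRIBER_NETWORKS) -> bool:
    """Both layers must pass; neither is trusted on its own."""
    return ip_allowed(client_ip, networks) and credentials_ok
```

Here `credentials_ok` stands in for whatever second factor is chosen (a password, a signed token, a campus login), so an open proxy inside a whitelisted range is no longer enough by itself.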
The high road by Outland+Traveller · 2002-12-11 13:25 · Score: 1, Troll

I find it difficult to sympathize with people who wish to keep academic journals locked away.
1. Re:The high road by kmellis · 2002-12-11 14:27 · Score: 2
  
  "I find it difficult to sympathize with people who wish to keep academic journals locked away." - Outland Traveler
  Yes. It's interesting and revealing to me that there is a presumption on the part of the contributor--a presumption that seems to be confirmed--that the Slashdot community is more friendly to protecting copyrighted scientific papers than they are copyrighted music. This is fucked-up moral reasoning.
Secure the origin server properly. by lifeless · 2002-12-11 13:44 · Score: 5, Informative

I'm not sure where the focus on IP address issues has come from... but RFC 2616 and RFC 2617 explicitly discuss secure access to WWW entities, and IP addresses are not the key.

IP address restrictions are of only limited use, due to HTTP's stateless behaviour. As I've noted in another post, chains of proxies will quickly eliminate any IP based restrictions.

Some steps that JSTOR could take include adding cache-control headers (must-revalidate comes to mind) to prevent cache hits occurring without the JSTOR servers' knowledge, and thus allow them to perform partial validation on the actual client (i.e. by checking the Via header). Note that checking the Via header is less-than-secure, but better than simply trusting the customer's proxy to be secure.

Secondly, use authentication - assign a username and password to the content, using (say) Digest authentication, which is proxy friendly. Mark the content as explicitly cacheable with revalidation, and you will get 1 If-Modified-Since request per download from proxies, and be able to check the user details each time. There would be an administrative issue with this, but I'll leave creative approaches to that as an exercise.
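
To make the two suggestions concrete, here is a hedged Python sketch. `digest_response` follows the RFC 2617 formula for `qop=auth` (MD5 is what that RFC specifies, not a recommendation for new designs). The header values in `REVALIDATE_HEADERS` are one plausible choice rather than a tested JSTOR configuration, and `count_via_hops` is a hypothetical helper for the Via check.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str,
                    nc: str, cnonce: str, qop: str = "auth") -> str:
    """Compute the 'response' value a client sends under Digest auth (qop=auth)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = _md5_hex(f"{method}:{uri}")                 # identifies the request
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Assumption: headers that let a proxy store the article but force it to
# check back with the origin server before serving the copy to anyone.
REVALIDATE_HEADERS = {
    "Cache-Control": "max-age=0, must-revalidate",
}


def count_via_hops(via_header: str) -> int:
    """Count the proxies listed in a Via header (comma-separated entries)."""
    return len([hop for hop in via_header.split(",") if hop.strip()])
```

The server recomputes the same hash from its stored credentials and compares it with what the client sent, so the password itself never crosses the proxy chain. A Via header can be omitted or forged by a hostile proxy, which is why it only supports partial validation.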
Idea for alternate academic peer review... by Syntari · 2002-12-11 14:00 · Score: 4, Interesting

I wonder if they could shift to a slashdot-type system... Post an article, then let any accredited reader moderate it. Initially, set moderation strength based on number of articles the particular reader has published in the relevant journals (weighting them for prestige of the journal)... after that, set moderation strength according to karma, which you get by posting an article and having it moderated upwards.
One can imagine various enrichments to this model (e.g., allowing a reviewer to go back and change his opinion of the article if he finds he cannot replicate the results in his laboratory), but I think you get the basic idea. Having everything in the open domain will indeed shut down the revenue for academic journals, but that doesn't mean that the time-honored system of peer review has to go down the drain, it just needs to be updated.
(Note: Reviewers who haven't yet published anything, and who do not have tenure at a recognized academic institution, will be awarded zero moderation strength; this is still a closed system for academics, even though it is based on openness. The usual disclaimers for strength of encryption - to ensure no impersonations - apply.)
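
The weighting rule described above can be sketched in a few lines of Python. The journal names, prestige weights, and the baseline given to a tenured reviewer with no publications are all invented for the example; the post does not specify them.

```python
from dataclasses import dataclass, field

# Assumption: made-up prestige weights for the relevant journals.
JOURNAL_PRESTIGE = {"Journal A": 2.0, "Journal B": 1.0}
# Assumption: starting strength for a tenured reviewer with no publications.
TENURE_BASELINE = 1.0


@dataclass
class Reviewer:
    name: str
    publications: dict = field(default_factory=dict)  # journal -> article count
    tenured: bool = False
    karma: float = 0.0  # earned when the reviewer's own articles are moderated up


def moderation_strength(reviewer: Reviewer, prestige=JOURNAL_PRESTIGE) -> float:
    """Prestige-weighted publication count plus karma; zero for outsiders."""
    published = sum(prestige.get(journal, 0.0) * count
                    for journal, count in reviewer.publications.items())
    base = published if published > 0 else (TENURE_BASELINE if reviewer.tenured else 0.0)
    if base == 0:
        return 0.0  # unpublished and untenured: no moderation strength
    return base + reviewer.karma


def article_score(votes) -> float:
    """Sum of votes (+1 or -1), each scaled by the voter's strength."""
    return sum(moderation_strength(reviewer) * vote for reviewer, vote in votes)
```

For instance, a reviewer with two articles in "Journal A" and one in "Journal B" starts with strength 5.0, while an unpublished, untenured reader's vote counts for nothing.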
1. Re:Idea for alternate academic peer review... by spetey · 2002-12-17 07:13 · Score: 1
  
  This is a great idea in the long-term. In the meantime a simpler solution is available for breaking the academic publishing cartel: simply have online journals with a pre-set review panel. For example, see http://www.philosophersimprint.org for an online philosophy journal with a serious editorial board.
#3 Take the materials down by hackwrench · 2002-12-11 14:13 · Score: 1

For some reason I really like #3. They're really not that important and getting rid of them will allow other faces to see the light.
Information wants to be free! by duffbeer703 · 2002-12-11 15:16 · Score: 1

If these so-called publishers were interested in academic integrity, they would GPL these so-called journals and distribute Free knowledge to the entire world.

--
Conformity is the jailer of freedom and enemy of growth. -JFK
Ooh! by Anonymous Coward · 2002-12-11 16:58 · Score: 0

Stealing 51,000 articles through an open proxy? You better believe that's a paddling!
What about ... (ie, no cause for concern) by .milfox · 2002-12-11 17:39 · Score: 2

Trusting that there will be enough people who care about your material to support your publication costs?

Fact is, most of these materials are being sold to university libraries who have open/semi-open access to the material as a mission statement. These organizations will still subscribe whether or not these materials are found 'in the open'. For one, they have to behave legitimately, and for another they know that the best way to continue the existence of these materials would be to keep paying their dues.

At the same time, these dues which constantly come in support you currently, correct? Are you just trying to maximize profit or is there a genuine concern of people switching to non-legitimate sources and thus a problem with continued existence? If it's the former, SHAME! The latter, well, publicise that information. Show real data about not being able to survive and then ask slashdot again.

And who knows, open access seems to work for some academic publishers, who know that the dues to continue their existence will come in because people care to support the content. Maybe it'll even work for you.

For a commercial example of the same, do check out the baen free library and what it has done to their sales. (www.baen.com)
Open Peer Review by Usquebaugh · 2002-12-11 17:56 · Score: 2

On google

Contrary to what the poster asks I feel the peer review process could be best served by using an open model. Most reviewers give their time for free. Most of the cost of a journal goes to publishers and printers.

A collaborative method would seem to benefit the authors, the reviewers and the readers. In fact the only losers would be the publishers/printers.

I'd love to read the latest journals but not at the prices they are asking.
DRM by jbolden · 2002-12-11 19:55 · Score: 2

Are there better solutions than turning into an attack dog, ala the RIAA and the MPAA?

This is essentially the argument for DRM. You want to be able to provide electronic information but in a way that it cannot be duplicated at will. Both Intel and Microsoft are working hard on making this possible and within a few years better solutions will exist than exist now.

So the short answer to your question is "sort of, but in practice not for another 2 years or so". I'm sure other posters will address the sort-of solutions. If you want to know what's coming, see the Palladium FAQ.

The more important issue as an academic press is where you are going to stand on the right to read. Academia depends on a relatively free flow of information that is inexpensive. By its very nature what you are asking to do is be able to control the downstream flow of information.

You may find that when the technology is available it is rejected by the academic community. You'll then have to decide if you are primarily a commercial agency providing digital content like Disney or Time Warner; or primarily an academic agency which supports freedom of information exchange even at the cost of lost sales.

Anyway, I suggest the following essay on the moral issues: The Right to Read.
Just an idea... by Spudley · 2002-12-12 01:59 · Score: 2

Are there better solutions than turning into an attack dog, ala the RIAA and the MPAA?

Here's my solution: publish the articles online with deliberate errors. Make sure that people who download them legitimately know what those errors are, so they can account for them, but people doing big bulk downloads as described in the question won't know (and probably won't care). Then they can publish it as much as they like, but they'll soon realise they've got themselves a dud.

Just an idea. Whether it's practical in your situation is another question - I don't know enough about what you're doing to answer that one.

--
(Spudley Strikes Again!)
Available for download? by Inominate · 2002-12-12 03:31 · Score: 2

"Even nonprofit academic publishers rely on income from publications to exist, so the spectre of large-scale unauthorized copying is legitimately scary to them."

So let me get this straight,
There are a large number of these publications available for download by people within certain IP blocks. These are available to be freely downloaded by anyone in those blocks. And people are using proxies to download them? There is no other security to prevent unauthorized users from accessing the site?

Am I the only one who sees something wrong here?

I especially like the use of the word "cracker". The big bad hacker used open proxies to hax0r your download page! Seriously though, if you're counting on IP-based authentication for networks you don't have control over, you're BEGGING for problems. IP-based authentication only works under the premise that the machines with those IPs can be trusted.