Britain's Conservatives Scrub Speeches from the Internet
An anonymous reader writes news of an attempt to erase a bit of history. From the article: "The Conservative Party have attempted to delete all their speeches and press releases online from the past 10 years, including one in which David Cameron promises to use the Internet to make politicians 'more accountable'. The Tory party have deleted the backlog of speeches from the main website and the Internet Archive — which aims to make a permanent record of websites and their content — between 2000 and May 2010."
Where's the torrent file?
People have used robots.txt to buy up domains they want to censor.
For example, this happened with partyvan.
How did they delete them from archive.org? Did they hack it?
Liberty - Security - Laziness - Pick any two.
http://www.huffingtonpost.com/2013/07/26/obama-whistleblower-website_n_3658815.html
Google cache etc will ensure every public speech made since the late '90s is kept forever and many made before that will also be indelibly etched into history.
The Internet is for Orwell...
Apparently porn will have to take second place to political power...
strei . . . you know what, screw it. Let them shoot themselves in the foot.
I read TFA and all I got was this lousy cookie
Because that's what they did in that book.
Laughter is the Spackle of the Soul.
Lucky they now have secret blacklists at every major UK ISP to block these. Think of the children that would be harmed by reading these speeches!
FTFA:
In a remarkable step the party has also blocked access to the Internet Archive's Wayback Machine, a San-Francisco-based library which captures webpages for future generations, using a software robot that directs search engines not to access the pages.
because they broke almost all of their pre-election promises.
The most important thing to learn about the Tory party in the UK is that, contrary to popular opinion, it is not the party for the responsible, the capitalists, nor the hard-working (except in the sense that they want most people to work hard for them). It is a party representing a few wealthy individuals, and their mission is not small government, but privatised government, where nothing happens without their masters getting a cut.
Sorta like a mafia.
“He who controls the past controls the future. He who controls the present controls the past.” George Orwell, 1984
See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
This is not accurate. Speeches made in Parliament are archived in Hansard for a start. And there is no changing that.
No, but the Wayback Machine always respects takedown requests. Note that the British Library maintains an archive of UK sites, and still has the speeches in question (from April 2008 onwards): http://www.webarchive.org.uk/wayback/archive/20080410100951/http://www.conservatives.com/tile.do?def=news.speeches.page
Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it.
There's a theory out there that states that because most of what we do in the so-called Information Age is stored in somewhat fragile digital storage systems (as opposed to, for example, parchment), historians in the future will have very little to base their research on about our age, as most of the info will be permanently lost.
Well, hundreds of thousands of posts on BBS systems from the 80's and 90's are already gone, delete the Internet Archive and the Web is gone too, any thoughts?
How'd they do that? Do they make a copyright claim on the record of speeches they made in public?
It's not even a takedown request. IA will honor robots.txt totally and retroactively - if they have 10-15 years of archived data at a specific domain (or subdirectory on that domain), and someone puts up a robots.txt disallowing them access, not only will they refuse to archive it going forward, but they will remove all previously archived material from being viewable (I hope they don't actively remove it from their archive, but merely stop making it available).
FC Closer
Archive.org will retroactively enforce a new robots.txt to all previously archived content.
This has been used by people who buy up domains targeted for censorship. This has happened to partyvan, for example, when it expired and was bought up and turned into a honeypot for watching 4channers and Anons. When this happened a new robots.txt was put in place and all of partyvan's history was deleted from archive.org.
Someone else put a backup of partyvan up on github in response.
They're going to sell them and turn them into an asset.
Astro
they don't want to be called out on their broken promises and outright lies
call me Mister Obvious
Politics is Treachery, Religion is Brainwashing
Are they legally allowed to remove things from archives like that without any prior notification or permission?
The post is misleading. The Conservative website now has a "robots.txt" file which is designed to prevent search engines like the Internet Archive from archiving current and future content. They did not delete previously archived content from the Internet Archive.
Basically, the robots.txt convention is based on politeness. It merely lists directories and files which "honest" search engines agree to not search through. There's nothing actually stopping anyone from ignoring these requests and searching those "disallowed" directories anyway.
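To make the "politeness" point concrete, here is a minimal sketch of what an honest crawler does before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt body and the example.com URLs are made up for illustration; this is not the file the Conservatives actually served. Note that the check lives entirely in the crawler's own code, so a crawler that skips it is stopped by nothing.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body (an assumption for illustration only).
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a polite crawler named user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/news/speeches/2008.html"
    # The archive crawler is asked to stay away from the whole site...
    print(is_allowed(ROBOTS_TXT, "ia_archiver", page))
    # ...while any other bot is only asked to avoid /private/.
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

A rude crawler simply never calls `is_allowed`, which is all that "ignoring robots.txt" amounts to.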
We as humans are not able to "remember" back further than 100 years. I mean that you cannot get any information from anyone that would give you a clear, practical understanding of the mindset from 100 years ago. You can go ask your grandparent(s) things about the past, but the vocabulary that they use more than likely won't fit your vocabulary and therefore you will not be able to get the understanding that they're trying for. Maybe 100 years is too small, but it can't be far from the real number, plus it's nice and round ;)
In this way, our society(s) are going through life sorta like that movie Memento. All that has to happen is a slight variation of the real story, that would produce the same basic result, but with a new context - Christopher Columbus "discovered" America comes to mind. Perhaps the powers that be depend on this, and are looking to make that number (100 here) smaller.
Politics; n. : A religion whereby man is god.
Every bloody time I see some inane web-related thing being done with obviously technically impaired judgement here in Britain, it's associated with the same name. Case in point: David Cameron is a bloody idiot.
In the U.S., politicians post speeches full of lies online, and nobody cares. I'm not sure if this is because everybody believes the lies, or because nobody believes the politicians.
http://www.seattlepi.com/national/article/Rumsfeld-denies-making-claims-Iraq-had-WMDs-1202942.php
http://www.youtube.com/watch?v=CU0m6Rxm9vU
It's amazing that politicians want to delete their broken promises from the web so they can lie more effectively. Good thing there are libraries that keep old newspapers, and hopefully always will.
Indeed this is ridiculous that the IA would retroactively remove stuff though as you say hopefully just disable access instead. Even then, why would they keep stuff they aren't displaying? It's an 'archive' and should reflect how stuff 'was' at the time; legalities of that obviously being quite murky and hard to defend against expensive lawsuits, but still.
People in cars cause accidents....accidents in cars cause people
The problem is career politicians who accept campaign contributions and favors. Many people call them bribes. The problem is it corrupts politics and ultimately serves themselves, not the people.
We need term limits for US congress!
The only possible way to start getting out of this mess are Liberty Amendments!!! Check it out.
Actually no, those speeches don't seem to exist on the party website now either.
People in cars cause accidents....accidents in cars cause people
get with the pogrom.
Sounds like somebody needs to archive the archive.
-----
Sorry, I'm only a 1336 h4x0r.
I apologize for my mistake. Until just a few minutes ago, I was unaware that the Internet Archive agrees to RETROACTIVELY honor a robots.txt file. So once a robots.txt file restricts access to content, they voluntarily remove access to previously archived content from the archive. Here's the related item from their FAQ:
Some sites are not available because of robots.txt or other exclusions. What does that mean?
The Internet Archive follows the Oakland Archive Policy for Managing Removal Requests And Preserving Archival Integrity
The Standard for Robot Exclusion (SRE) is a means by which web site owners can instruct automated systems not to crawl their sites. Web site owners can specify files or directories that are disallowed from a crawl, and they can even create specific rules for different automated crawlers. All of this information is contained in a file called robots.txt. While robots.txt has been adopted as the universal standard for robot exclusion, compliance with robots.txt is strictly voluntary. In fact most web sites do not have a robots.txt file, and many web crawlers are not programmed to obey the instructions anyway. However, Alexa Internet, the company that crawls the web for the Internet Archive, does respect robots.txt instructions, and even does so retroactively. If a web site owner decides he / she prefers not to have a web crawler visiting his / her files and sets up robots.txt on the site, the Alexa crawlers will stop visiting those files and will make unavailable all files previously gathered from that site. This means that sometimes, while using the Internet Archive Wayback Machine, you may find a site that is unavailable due to robots.txt (you will see a "robots.txt query exclusion error" message). Sometimes a web site owner will contact us directly and ask us to stop crawling or archiving a site, and we endeavor to comply with these requests. When you come across a "blocked site error" message, that means that a siteowner has made such a request and it has been honored.
Currently there is no way to exclude only a portion of a site, or to exclude archiving a site for a particular time period only.
When a URL has been excluded at direct owner request from being archived, that exclusion is retroactive and permanent.
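The retroactive behaviour described in that FAQ can be modelled in a few lines. This is only a toy sketch under stated assumptions: `ToyArchive`, `Snapshot` and the `ia_archiver` crawler name are hypothetical stand-ins, not the Internet Archive's actual code, and it assumes (as several posters here hope) that blocked captures are hidden rather than purged.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


@dataclass
class Snapshot:
    url: str
    captured: str  # capture date as text, e.g. "2008-04-10"
    body: str


@dataclass
class ToyArchive:
    """Hypothetical archive model; names and behaviour are assumptions."""
    crawler_name: str = "ia_archiver"
    snapshots: list = field(default_factory=list)
    current_robots: dict = field(default_factory=dict)  # host -> robots.txt text

    def store(self, snap: Snapshot) -> None:
        self.snapshots.append(snap)

    def set_robots(self, host: str, robots_txt: str) -> None:
        # Only the *latest* robots.txt per host is remembered.
        self.current_robots[host] = robots_txt

    def _blocked(self, url: str) -> bool:
        robots_txt = self.current_robots.get(urlsplit(url).netloc)
        if robots_txt is None:
            return False
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        return not parser.can_fetch(self.crawler_name, url)

    def visible(self, url: str) -> list:
        # Today's robots.txt decides visibility for every past capture.
        if self._blocked(url):
            return []
        return [s for s in self.snapshots if s.url == url]
```

In this model a single new `set_robots(host, "User-agent: *\nDisallow: /")` call makes `visible()` return nothing for that host, while `snapshots` still holds every capture.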
couple that with the google cached copy of the site has a 'search for speeches' section which now is, interestingly enough, missing as well.
People in cars cause accidents....accidents in cars cause people
So there's no actual internet archive? How was this not planned for years ago?
That really sucks. And explains why I've not been able to find older versions of my own websites.
Students usually want to hide F's. Don't want to look stupid.
Wonder what these conservatives are trying to hide? Not much point trying to hide their stupidity. Everyone already knows that about them.
Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
Is that actually supposed to be something easy to find as it is? I'd love to read some of my old papers from my school years! Is there some kind of big archive somewhere??
Sounds like somebody needs to archive the archive.
Ah yes, but then will need an archive to archive all the archives that are not archived in the archive. Are you starting to see the problem?
Surely, they have a copy somewhere.
It fully explains it. Someone bought up the domain that you were hosted on previously, added a blanket disallow in robots.txt, and suddenly all your old stuff is gone.
FC Closer
I'm pretty sure I read somewhere that they have a lot of stuff that isn't publicly available on their website for one reason or another. Don't have a citation though.
FC Closer
Indeed this is ridiculous that the IA would retroactively remove stuff though as you say hopefully just disable access instead.
I think the archive actually does just suppress access rather than purge the actual data, so they can again display it once copyright runs out (if it ever does...).
I also think the point is that newbies may not know about robots.txt and that even an experienced webmaster might accidentally allow access to something private long enough for it to get archived, or receive and honor a takedown notice, so this allows the correction of the error.
It's an 'archive' and should reflect how stuff 'was' at the time; legalities of that obviously being quite murky and hard to defend against expensive lawsuits, but still.
That's why. They have limited funds and need them to buy more disks and stuff, not fight lawsuits. If the choice is not display some stuff or go broke and not display anything, the choice is also obvious.
I wish, though, that they were able to detect when a domain changed hands and not honor robots.txt requests retroactively past the boundary. IMHO a new owner is a new web site that happens to have the same name.
Especially: I wish domain name parking sites didn't put up robots.txt files that cause the archive to immediately purge/hide the previous owners' content. I've lost access to a lot of content from dead sites that way. (It also keeps the owners from rescuing their old content if they don't have personal backups.)
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
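The wish above (honor robots.txt retroactively only back to the point where the domain changed hands) could look something like this. It is a sketch under a big assumption: that the archive somehow knows the date of the last ownership change, which robots.txt itself does not tell you. The function name and parameters are hypothetical.

```python
from datetime import date
from typing import List, Optional


def visible_snapshots(
    capture_dates: List[date],
    blocked_now: bool,
    last_owner_change: Optional[date],
) -> List[date]:
    """Return the capture dates that would stay viewable.

    blocked_now: whether the current robots.txt disallows the archive crawler.
    last_owner_change: assumed-known date the domain last changed hands,
        or None if the owner never changed.
    """
    if not blocked_now:
        return list(capture_dates)
    if last_owner_change is None:
        # Same owner throughout: the block applies to everything,
        # as with the retroactive policy discussed in this thread.
        return []
    # New owner: their robots.txt only covers captures made on their watch.
    return [d for d in capture_dates if d < last_owner_change]
```

For example, with captures from 2005, 2009 and 2013 and a domain sale in 2012, a parking page's blanket disallow would hide only the 2013 capture.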
what do you have to fear? 8)
"I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert
No, I added the robots.txt myself :-\
The domains are still mine, just took them with me to the different webhosts I've been working for.
OTOH, nothing of value has been lost, just wanted to know exactly what I wrote about Seven of Nine 13 years ago.
In fact, it was predicted. It was a particularly sharp observer of English politics who coined the phrase "memory hole".
Welcome to the Panopticon. Used to be a prison, now it's your home.
Speeches deemed double plus un-good must be corrected.
controls the past. He who controls the past controls the future.
Captcha: Ruthless
Not much of an archive if they delete the past because someone says it should be deleted. Even Wikipedia allows you to go back and see all changes to an article.
my karma will be here long after I'm gone
This can only be completed with the complicity of a corrupt Press. Which is not to say that all publishers have an activated altruism circuit. It only acknowledges the idealism promoted by those who believe the Press as a whole comprises a valuable institution.
Personally, I believe the future of history is imperiled more by the fragility and corruption in human political philosophies than by the impermanence of digital storage or the evolution of electronic storage media and formats.
Interesting. Are they going back and censoring Hansard too?
This isn't news or at least not in the way that the author of the original article means. The main opposition party, Labour, had already done the same thing.
AC
Ah yes, but then will need an archive to archive all the archives that are not archived in the archive. Are you starting to see the problem?
That should work, as long as there is a standard.
cheers,
Another flamer/fool/idiot that sees a conspiracy in every hooker that he buys.
They don't delete it, merely delist it from the Wayback Machine. Ask them nicely and I bet they'd turn it into a special collection on the archive proper.
Apparently using the common practice of putting up a robots.txt to ask crawlers to stay away for better political messaging control is actually an Orwellian thought-control power grab. Obviously Cameron was talking about other news/archive sites keeping a permanent record of his speeches, since that is the only way it could work for any party in power. Do we really expect a politically motivated website, of any party, to keep an honest and complete record of its positions and speeches for indefinite periods of time for public scrutiny?
The way this headline read, I half expected to hear about Cameron's administration sending take-down notices to bloggers for dredging up campaign promises, but no, his party's website just updated its robots.txt, sheesh!
--"You are your own God"--
I just tried to complain to my MP about this but it seems he's blocked me on Twitter. I guess that's it then, we are living in a fascist state.
I want a list of atrocities done in your name - Recoil
No, I added the robots.txt myself :-\
The domains are still mine, just took them with me to the different webhosts I've been working for.
OTOH, nothing of value has been lost, just wanted to know exactly what I wrote about Seven of Nine 13 years ago.
Well that is the thing... some things are better off lost. Apparently the Internet Archive is testing the "cannot be unseen" principle.
I'm a good cook. I'm a fantastic eater. - Steven Brust
It makes Winston Smith's job at the Ministry of Truth more difficult if there are old archives available..
So there's no actual internet archive? How was this not planned for years ago?
People mistakenly thought the Internet Archive was an actual archive of the internet, instead of the "Internet Archive of Uncensored Things". (until today i was one of these people)
Perhaps now this will either make IA do the right thing, or perhaps someone will step up to the plate.
I'm a good cook. I'm a fantastic eater. - Steven Brust
Try a few decades in the future. After that the data will be gone - the dye in writable CDs and DVDs does not last beyond that and even then when stored in ideal conditions.
Suppose they archive some kiddie porn. And then they let people download it. And those people make donations to them. And there's more than one person involved in administering the archive. That's conspiracy to distribute child porn for financial gain. No shit they censor the archive, dumbass. For self-preservation, if nothing else.
You KNOW of the BBC's destruction of most of its 1960s video archive through the stories of the 'LOST' Doctor Who episodes, but you do not know the reason behind the destruction. The usual filthy shills spread NONSENSE about 're-use' of video tapes, even though the archival shows that were destroyed were either on un-reusable FILM, or on long obsolete BLACK-AND-WHITE video tape standards (the destruction occurred long after the BBC had fully converted to colour production).
The actual reason for trashing the archive at the time was given as "cost"- apparently the Earth's most cash rich broadcaster just couldn't afford the 'massive cost' of storage, and the handful of people employed to look after the archive. Also the entertainment unions actively encouraged the destruction of material that could otherwise be repeated in place of 'new' output. The BBC, at the time of the destruction, had a '3 times only' rule for the maximum number of broadcasts of any given episode of a show. The BBC argued to the public that the archival footage was pointless, since it could never be rebroadcast anyway, and everyone hated old fashioned black-and-white footage.
The REAL reason for destroying the BBC archives was the SAME as the reason that lies behind UK politicians attempting to push previous speeches and manifesto material down the MEMORY HOLE. Remember where Winston worked in Orwell's 1984? Orwell (with the unfortunate real name of Blair) was accused of writing his books as metaphors about communist Russia, but actually Orwell was writing about British Fabian 'socialism'. Stalinist communism was way too crude an ideology to need sophisticated deconstruction. But Orwell was intimately aware of the far more insidious movements within British politics and the BBC.
Orwell's theme is that the sheeple (well, actually Orwell dismissed the working classes as irrelevant and without influence, so his 'sheeple' were entirely the middle-classes) have the memory of a goldfish, so monsters MUST take full control of all media outlets to ensure the sheeple have a CONSTRUCTED view of 'history' at all times. He predicted that we would move from "History is written by the victors" to "History is created entirely as a fiction to suit the current needs of the ruling monsters".
Look, for years the 'Liberal' party in the UK stated in manifesto after manifesto that if elected (and this party has been a distant third for most of the last five decades plus) it would have only ONE major policy, to spend whatever it took to give absolute priority to education. When the Liberals gained real power a few years back, in alliance with the Conservative party, the first act of the Liberals was to raise the cost of university education in England to almost the highest level in the world. Of course politicians have no desire for you to be reminded of what they said and promised only a few years back, let alone decades ago.
To get elected, the Liberals gave a party political broadcast where their leader PROMISED on screen to not raise university fees. How many UK sheeple now remember this broken promise? I'd bet the figure is far less than ten percent and falling.
It so happens that the leader of the Liberals, a vile war monger named Clegg, is actually a Tony Blair loyalist, and while he has the formal description of 'deputy' Prime Minister, he is actually the de facto Prime Minister, while the laughable joke Cameron, the guy who carries the title of Prime Minister, is just one of Blair's dingleberries. Clegg has overseen the most radical attacks on the rights and freedoms of UK citizens in the last 50+ years, and has particularly focused on attacking the young and the poor. You Americans might not see this as odd, since you do not know that every year, at the party conference, the Liberals had seemingly been supporting the exact OPPOSITE agendas. Once in power, the Liberals switched from being more LEFT-wing than New Labour to more RIGHT-wing than the Conservatives.
Without a memory hole constantly eating inconvenient facts, e
That is right up there with the scummiest political moves I have heard of.
They recently made an announcement that contrary to their election pledge (again) they would look to make economic austerity permanent, instead of scaling back on cuts once the economy recovered.
I would have a bet that in a couple of days if the pressure is still on that they'll either claim it was a mistake, or hackers. Who knows, it might even be true.
I could easily believe that they are stupid enough to think that deleting a few pages erases the past.
You can't get something off the internet. That's like getting pee out of a pool: once it's in there, it's in there.
So they don't even approve of their own messaging? Seems very unconfident. Why should anyone believe in them if they don't even believe in themselves?
Twinstiq, game news
They just slurp them up from an unencrypted http connection when some visitor requests them.
If it's a visitor they're particularly interested in, they might go to the trouble to compromise that visitor's PC and/or MITM the SSL connection their PC forms to your server. So even encrypting your traffic won't necessarily stop them from collecting all of your content, just in case it turns out several years from now that your content has some nefarious connection to terrorists or drug dealers, or perhaps some politician they want to blackmail, or (eventually) to some regular citizen with "unpopular" political views who they want to round up or add to secret watch lists for extra harassment/scrutiny.
Archivists will exercise best-efforts compliance with applicable court orders. Beyond that, as noted in the Library Bill of Rights, 'Libraries should challenge censorship in the fulfillment of their responsibility to provide information and enlightenment.'
Seems like this may just have slipped past them. Let's make sure they know they need to sort it out... Surely they only removed it from the Wayback Machine, not from the archive itself.
Oh, I'm sorry sir, I thought you were referring to me, Mr. Wensleydale.
Here in Canada, Conservative PM Harper has taken heat lately for breaking all the links on our government's historical archive of the legislation that's been posted for the past decade or two. It's just... gone. The entire archive, except for maybe the past 5 years worth.
That archive is public government information, not Conservative property.
I do not fail; I succeed at finding out what does not work.
I wonder what chance there is that some of this material is available from other sources? It's a pity to read about destroyed archives. I've wanted to see some of David Attenborough's older material from the '50s and '60s that I read was very well made. The BBC had such great material, like the "Voyage of the Beagle", etc. PBS also had a lot of great programming they're also not releasing; stuff I'm sure they have, but won't distribute.
Nice try, but I'm not buying it. I'd like to think as a non producer or non creator of content their liability would be covered under the same principle as ISP exemption. Do the jackboots come for Comcast because someone peer-to-peers or receives via email a picture of Candy Sweet allegedly taken one day before her 18th birthday? I don't think so. Why should the Internet Archive be treated any differently?
Perhaps I am naive, but one thing I am not. I am not wrong on the principle. That is the way it SHOULD be.
Robots.txt should be subject to exactly the same principle as the US Congress. No retroactive censorship, just the way ex post facto laws are unconstitutional.
Come to that, I am not clear on what the legal basis of robots.txt is. To the best of my knowledge there is no legal basis whatsoever for prosecuting non-compliance; i.e., it is a gentlemen's agreement; no more. Am I mistaken?
Especially: I wish domain name parking sites didn't put up robots.txt files that cause the archive to immediately purge/hide the previous owners' content. I've lost access to a lot of content from dead sites that way. (It also keeps the owners from rescuing their old content if they don't have personal backups.)
This.
Today I have lost a lot of respect for the Internet Archive.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
How does that work?
---- Booth was a patriot ----
more people should frequent reddit.com/r/datahoarder
Torrents only work if the data is available somewhere.
Not much of an archive if they delete the past because someone says it should be deleted. Even Wikipedia allows you to go back and see all changes to an article.
except for page deletion. In that case, only certain people can view the history.
Did you mount a military-grade, variable-focus MASER on an unlicensed artificial intelligence?
Personally, if I ever found myself in possession of child porn, I'd get rid of it as fast as possible. I suggest the same to you. The jackboots come after whoever the fuck they want to, and find evidence of wrongdoing after the fact... if nothing else, "lying to the police during an investigation" will always stick no matter how small the lie. The less child porn you (knowingly or not) distribute, the fewer pretexts the jackboots have to knock down your door and break your server hardware in an "attempt to gather evidence" and shoot the family dog in front of your kids.
You are not mistaken about the legal power of a robots.txt: it's a gentlebot/gentleadmin agreement. But you are mistaken about the implied legal threat behind one. If you want things done the way you want, do it yourself. The only better thing than an archive is two archives maintained by different people in different political climates.
As short a time ago as February, the Ministry of Plenty had issued a promise (a "categorical pledge" were the official words) that there would be no reduction of the chocolate ration during 1984. Actually, as Winston was aware, the chocolate ration was to be reduced from thirty grams to twenty at the end of the present week. All that was needed was to substitute for the original promise a warning that it would probably be necessary to reduce the ration at some time in April. -- 1984
This is a kind of ridiculous excursion off point. Nobody is "distributing" anything in the context of the discussion.
What I find perplexing is why the Internet Archive chooses, on their own initiative, to make an ex post facto censorship based on robots.txt. Robots.txt tells you "be nice and don't process the site". It's enough to stop processing the site when you see the request. It's an absurdity to go back retroactively and undo what was done in the past, before there was a robots.txt request. The Internet Archive's policy is absurd. It's batty.
I certainly couldn't agree more, there ought to be a dozen or a hundred internet archives, geographically distributed. It ought to be impossible for anyone to make data that was once free go away.
I always thought this was built into the Conservative philosophy by definition, at least the far Right. You have principles you try to uphold for as long as possible, by any means necessary.
-Ultimate Stickman Game Developer Infinite World Puzzler
like in a print newspaper or magazine or something.
Many might think "problem"...
I think "opportunity"!
I've written before how the archive can do a lot more than it presently does. For a start, it could be used as evidence. So how do we make money out of this? Well, the archive only goes so far in what it stores. If there is a page that we want to be stored for longer and more securely then we should be able to pay for that. More securely? How's about paying for it to be backed up to a p2p network, similar to SatoshiProof. You then charge a small fee extra for the service.
Of course, you could do the same for following or not following robots.txt but this is closer to messing with the integrity of archive.org
Just make sure this upper tier in the 2 tier system is genuine new functionality.
A blog I run for the wealth
It's probably worth keeping a copy for historical purposes.
I agree that it's ridiculous. The access should be granted with the robots.txt that was valid _at the time_. Just archive it together with the website.
Non-Linux Penguins ?
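The "robots.txt valid at the time" idea above can be sketched with the standard `urllib.robotparser`: store the robots.txt seen at crawl time next to each capture, and decide visibility from that stored copy instead of from whatever the site serves today. The tuple layout and the `ia_archiver` crawler name are assumptions for illustration, not how any real archive stores its data.

```python
from urllib.robotparser import RobotFileParser


def allowed_at_capture(robots_at_capture, url, crawler="ia_archiver"):
    """Check url against the robots.txt text that was live when it was captured.

    robots_at_capture may be None or "" if the site had no robots.txt then,
    in which case everything was allowed.
    """
    parser = RobotFileParser()
    parser.parse((robots_at_capture or "").splitlines())
    return parser.can_fetch(crawler, url)


def visible(captures):
    """captures: list of (url, captured_on, robots_txt_at_that_time) tuples."""
    return [
        (url, captured_on)
        for url, captured_on, robots in captures
        if allowed_at_capture(robots, url)
    ]
```

Under this rule a robots.txt added in 2013 would have no effect on captures from 2008, because each capture is only ever judged against its own stored copy.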
Pick any one.
Until someone tries to decide if the archive of unarchived archives archives itself or not.
At least in Britain they have to make an attempt to change history. Here in the U.S.A. our attention spans are so short that it doesn't matter what stupid, appalling, ridiculous thing any politician says; it will be forgotten by the masses and largely ignored by the media shortly after it comes out. If it's not making the media money, or if it's not a video news clip, or something with small, simple words that's no bigger than a postage stamp to read, no one will even know it ever existed.