The Wayback Machine is Deleting Evidence of Malware Sold To Stalkers (vice.com)

Posted by msmash on Tuesday May 22, 2018 @06:01AM from the stranger-things dept.

The Internet Archive's Wayback Machine is a service that preserves web pages. But the site has been deleting evidence of companies selling malware to illegally spy on spouses, Motherboard reported Tuesday. From the report: The company in question is FlexiSpy, a Thailand-based firm which offers desktop and mobile malware. The spyware can intercept phone calls, remotely turn on a device's microphone and camera, steal emails and social media messages, as well as track a target's GPS location. Previously, pages from FlexiSpy's website saved to the Wayback Machine showed a customer survey, with over 50 percent of respondents saying they were interested in a spy phone product because they believe their partner may be cheating. That particular graphic was mentioned in a recent New York Times piece on the consumer spyware market.

In another example, a Wayback Machine archive of FlexiSpy's homepage showed one of the company's catchphrases: "Many spouses cheat. They all use cell phones. Their cell phone will tell you what they won't." Now, those pages are no longer on the Wayback Machine. Instead, when trying to view seemingly any page from FlexiSpy's domain on the archiving service, the page reads "This URL has been excluded from the Wayback Machine."

robots.txt by Thad+Boyd · 2018-05-22 06:06 · Score: 5, Interesting

The Wayback Machine obeys robots.txt, even retroactively. If a site puts up a robots.txt file, archive.org will remove old versions of the site.
1. Re:robots.txt by gnick · 2018-05-22 06:46 · Score: 3, Insightful
  
  The thing about preserving data is that you need to do it before the court order to be of any use.
  
  --
  He's getting rather old, but he's a good mouse.
2. Re:robots.txt by jythie · 2018-05-22 06:50 · Score: 3, Interesting
  
  It is not all that mysterious that such a policy or mechanism exists, but it still highlights the piece's argument that we need more archives, since a single point of failure is, well, a single point of failure. I remember people saying, when I was growing up, that 'the internet is forever' and 'once it is out there it is always there', but over the decades one slowly finds more and more things that seem to be gone for good if they fail to be popular enough to keep spreading.
3. Re:robots.txt by rahvin112 · 2018-05-22 07:50 · Score: 4, Interesting
  
  The internet archive (Wayback Machine) does not delete the data for sites with robots.txt that restrict data access. It simply marks the pages as unavailable if it already has them. Now I don't know if they will download new copies once the robots.txt is changed but they don't delete data they already have.
They will delete yours too, if you ask by Anonymous Coward · 2018-05-22 06:12 · Score: 5, Informative

See https://archive.org/about/faqs...
If you want to delete your site from the wayback machine, all you have to do is ask them. They are not obligated to keep any page in the archive, whether it contains "evidence" or not. You can also exclude ia_archiver user agent in your robots.txt, which will prevent your site from being indexed in the first place. This way you will not even have to ask them.
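
For anyone unsure what such an exclusion looks like, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules below (a blanket `Disallow: /` for `ia_archiver`) and the `example.com` URL are assumptions for illustration; the comment above only names the user agent. The sketch shows how a robots.txt-compliant parser reads the rules, not how archive.org itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block only the ia_archiver user agent,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page.html"  # placeholder URL
    print("ia_archiver:", is_allowed("ia_archiver", page))
    print("SomeOtherBot:", is_allowed("SomeOtherBot", page))
```

Whether a given crawler honors these rules is up to the crawler; the file itself is only a request.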
1. Re:They will delete yours too, if you ask by jarkus4 · 2018-05-22 07:45 · Score: 3, Informative
  
  Robots.txt will not work as they started ignoring it (https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/), but the email method still works.
It wasn't malware by Anonymous Coward · 2018-05-22 06:12 · Score: 5, Funny

It wasn't malware; in the American language it would be called something like an "analytics and management platform, with realtime reporting and active asset monitoring and protection".
Yep, that's how it works by Anonymous Coward · 2018-05-22 06:40 · Score: 5, Insightful

It is very annoying, but that's how it works. The worst is when a site owned by an entity that goes out of business is preserved by the Wayback Machine, but then another entity gets the domain, puts up a robots.txt, and there goes all the history.
For all the good it is doing, it would be so much better if it did not apply robots.txt retroactively. It doesn't even make sense: robots.txt says "bots stay out", which is not nearly the same as "bots, forget whatever you had visited in the past"...
1. Re:Yep, that's how it works by bill_mcgonigle · 2018-05-23 01:38 · Score: 3, Insightful
  
  Almost certainly this is how archive.org manages to not get sued out of existence by malicious litigants who want to hide their misdeeds.
  If you can figure out how to make the legal system non-abusive, let's do that and then I'm sure archive.org will keep all their old crawls available.
  In the meantime let's support them for staying around.
  
  --
  My God, it's Full of Source!
  OUTSIDE_IP=$(dig +short my.ip @outsideip.net)