51% of Internet Traffic Is "Non-Human"
hypnosec writes "Cloud-based service Incapsula has revealed research indicating 51 per cent of website traffic is through automated software programs, with many programmed for malicious activity. The breakdown of an average site's traffic is as follows: 5% is due to hacking tools looking for an unpatched or new vulnerability within a site, 5% is scrapers, 2% is from automated comment spammers, 19% is the result of 'spies' collating competitive intelligence, 20% is derived from search engines (non-human traffic but benign), and only 49% is from people browsing the Internet."
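A quick sanity check of the breakdown quoted in the summary: the figures should add up to 100%, and the automated categories to the headline 51%. This is only a tally of the numbers as quoted; the dictionary keys are shorthand labels chosen for illustration, not Incapsula's own category names.

```python
# Percentages as quoted in the summary; the key names are illustrative shorthand.
breakdown = {
    "hacking tools": 5,
    "scrapers": 5,
    "comment spammers": 2,
    "spies": 19,
    "search engines": 20,
    "humans": 49,
}

total = sum(breakdown.values())
non_human = total - breakdown["humans"]
# Automated traffic other than the search engines the summary calls benign.
non_search_bots = non_human - breakdown["search engines"]

print(f"total: {total}%")
print(f"non-human: {non_human}%")
print(f"non-human excluding search engines: {non_search_bots}%")
```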
http://funpics.classicfun.ws/var/albums/Funpics/On%20the%20Internet%20nobody%20knows%20you're%20a%20dog.jpg?m=1300661194
"only 49% from people browsing the Internet." I wonder how much of that 49% is porn.
"If any question why we died, Tell them because our fathers lied."
I knew it!
My blog
Which of those categories do data analysis and aggregation tools fall into?
I'm thinking of user-focused tools like RSS Readers, Stock Quote graphers, etc... They're automated non-human tools which access websites, but it's not clear how they are being categorised...
"Go to CNN [for a] spell-checked, fact-checked summary" -- CmdrTaco
the article seems to be about websites, not the Internet
Hey, now, I know the United States isn't exactly the only game in town anymore, but you guys could be a little more sensitive.
... PORN?!?
It says right there in the summary: "only 49% from people browsing the Internet." Although you could argue that it's higher than that since spiders must crawl through porn too. Adding the 20% for the search spiders, we have that 69% of web traffic is porn related. A fitting number, I dare say.
If you run WordPress for your site... It's more like 50% bots (search engines), 40% comment spam, 8% content scanners and 2% visitors....
The summary is ok, but the title is completely wrong. It could well be 51% of HTTP requests, but as far as "Internet traffic" goes, it's probably a tiny fraction of a percent.
In fact, why is it even surprising or newsworthy that 50% of HTTP requests are malicious? Anyone who runs a public web server will be able to see that pretty quickly (though as long as it's configured correctly, the actual traffic will be tiny, consisting of a whole bunch of 404s).
4chan -- where the men are men, the women are men, and the children are FBI agents.
Any webmaster should already know this, probably way more than 51% for websites in existence for several years.
Agreed. I was thinking only 51%? I currently toss roughly 65% of my logs out when I'm calculating how much human traffic we've received.
William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
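For anyone wondering what "tossing" part of the logs might look like, here is a minimal sketch. It assumes access logs in the common "combined" format, where the user agent is the last quoted field, and it uses a hypothetical list of substrings (BOT_MARKERS) to flag automated clients. It is not the poster's actual method, and since user-agent strings are trivially spoofed it will undercount bots.

```python
import re

# Hypothetical marker list; a real filter would need a much longer, maintained list.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp", "curl", "wget")

# Assumes combined log format: the user agent is the last quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def is_probable_bot(log_line: str) -> bool:
    """Return True if the line's user agent is missing or matches a bot marker."""
    match = UA_PATTERN.search(log_line)
    if match is None:
        return True  # no user agent field at all: treat as non-human
    user_agent = match.group(1).lower()
    return user_agent in ("", "-") or any(m in user_agent for m in BOT_MARKERS)


def human_share(lines) -> float:
    """Fraction of log lines that are NOT flagged as probable bots."""
    lines = list(lines)
    if not lines:
        return 0.0
    humans = sum(1 for line in lines if not is_probable_bot(line))
    return humans / len(lines)


# Made-up sample lines for illustration only.
SAMPLE = [
    '66.249.66.1 - - [14/Mar/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [14/Mar/2012:10:00:05 +0000] "GET /post HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0"',
    '198.51.100.9 - - [14/Mar/2012:10:00:09 +0000] "GET /wp-login.php HTTP/1.1" 404 0 "-" '
    '"curl/7.21.0"',
]

print(f"human share of sample: {human_share(SAMPLE):.0%}")
```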
They're not saying 50% are malicious, just non-human. I get a fairly large chunk of traffic from google's bots, which I don't consider malicious.
William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
Seriously. Do they have liberal arts majors writing the headlines at /. now?
I am becoming gerund, destroyer of verbs.
If you get a fairly large chunk of traffic from Google's bots, then you must have almost no *actual* daily traffic :)
Try using a calendar which has next month and year links (along with every day therein) and doesn't know googlebot is coming..... gigs. Seriously.
The internet is dangerous, buy our security product.
Here's the original ZDNet blog post. It's a longer article with more detail; it's also linked at the bottom of TFA, which seems to have plagiarized it. Compare the first paragraphs:
[TFA] Cloud-based service, Incapsula, has revealed research indicating that 51 per cent of website traffic is through automated software programs; with many programmed for the intent of malicious activity.
[ZDNet] Incapsula, a provider of cloud-based security for web sites, released a study today showing that 51% of web site traffic is automated software programs, and the majority is potentially damaging, — automated exploits from hackers, spies, scrapers, and spammers.
The sentence structure and order of ideas is identical, and many phrases are the same or nearly the same. A high schooler should do better. Minor rephrasing is not sufficient.
That said, both articles are pretty much advertisements. The study doesn't appear to have attempted to actually be comprehensive (so it only used data from this one company). The point was apparently to give this cloud service provider some selling points for businesses to use their service to "secure" their sites. This story is yet another that shouldn't even have appeared on /.; shame on the editors who let it through.
Incapsula, a provider of cloud-based security for web sites, released a study today showing that 51% of web site traffic is automated software programs, and the majority is potentially damaging, — automated exploits from hackers, spies, scrapers, and spammers.
and it just so happens that Incapsula has the perfect solution to save you from all this... for a price.
Anons need not reply. Questions end with a question mark.
You'd think a Slashdot poster would know the difference between the web and the Internet. Sigh.
Bots usually send short emails for throughput reasons. Why waste bandwidth when you are both trying to use little enough that you don't get caught and your peak email send rate is inversely proportional to content size?
Another tidbit that I'm sure a bunch of people know but is worth throwing out there: spam with images, there is a reason for that. The images round-trip to the spammer's servers. Usually they set it up so that your email account is tagged somehow in the URL that your viewer sends to their server. So opening the email "calls home" and tells the spammer "hey, I got a real email address" (and likely someone gullible enough to look at spam). The spammer can then add your email address to a list of "live email accounts", which sells easily for 10X what a list of unconfirmed email addresses does. So ... if you don't recognize the sender, don't even open it, even if it is just in your webmail client. If you do, expect more spam.
Last time I checked whenever I sent any data across the net, it was not human, but rather data.
With IPv6, no more wholesale scanning of the entire global address space in minutes looking for exploitable hosts. No more 5 minutes to ownage of unpatched PCs and the associated waste of bandwidth.
No more self-propagating worms using simple algorithms to divide and conquer the global network.
In the grand scheme of things it won't help much but better than nothing.
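A rough back-of-the-envelope version of the scanning argument, as a sketch: it compares the time to send one probe to every IPv4 address with the time to do the same for a single IPv6 /64 subnet. The probe rate (RATE) is an assumed figure picked for illustration, not a measurement, and real scanners skip reserved ranges and use smarter target selection.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def scan_seconds(address_bits: int, probes_per_second: float) -> float:
    """Time to send one probe to each of 2**address_bits addresses."""
    return (2 ** address_bits) / probes_per_second


# Assumption: ten million probes per second, chosen only for illustration.
RATE = 10_000_000

ipv4_minutes = scan_seconds(32, RATE) / 60                       # whole IPv4 space
ipv6_subnet_years = scan_seconds(64, RATE) / SECONDS_PER_YEAR    # one /64 subnet

print(f"IPv4 (2^32 addresses): about {ipv4_minutes:.1f} minutes")
print(f"One IPv6 /64 (2^64 addresses): about {ipv6_subnet_years:,.0f} years")
```

At that assumed rate the whole IPv4 space takes minutes, while one /64 takes tens of thousands of years, which is why brute-force sweeps stop being practical.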
Thanks, but I prefer "This." ;)
Agreed. I was thinking only 51%? I currently toss roughly 65% of my logs out when I'm calculating how much human traffic we've received.
The interesting thing is that 51% is identifiable as bots. What about bots that are designed to emulate real users?
I mention that because I have written some bots that are designed to emulate users as closely as possible, so as to not be noticed by paranoid webmasters. Mine follow valid workflow scenarios, and even pause appropriate amounts of time between post backs, so I am fairly certain that they have gone unnoticed.
I don't think that I am more clever than the average hacker, so I am sure that others are doing this sort of thing, too.
HA! I just wasted some of your bandwidth with a frivolous sig!
Perhaps you're just considering a specific country. According to Wikipedia, the overall world sex ratio is 101 males to 100 females. At birth, the ratio is more like 106 males to 100 females, though males die earlier than females, especially in their last years. (An aunt who used to be a delivery room nurse told me that female babies are generally stronger than males, so e.g. a premature female has a higher chance of surviving.) Some cultures don't like girl babies, leading to infanticide or abortions, so the ratio can get artificially skewed; it also just seems to naturally vary a bit.
Almost everyone dies in their last year....