Crazy Firewall Log Activity — What Does It Mean?
arkowitz writes "I happened to have access to five days worth of firewall logs from a US state government agency. I wrote a parser to grab unique IPs out, and sent several million of them to a company called Quova, who gave me back full location info on every 40th one. I then used Green Phosphor's Glasshouse visualization tool to have a look at the count of inbound packets, grouped by country of origin and hour. And it's freaking crazy looking. So I made the video of it and I'm asking the Slashdot community: What the heck is going on?"
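For anyone wanting to repeat the first step, here is a minimal sketch of pulling unique IPv4 addresses out of log lines and keeping every 40th one. The submitter's actual parser and log format are not shown anywhere, so the regex, the function names and the sample line layout are all assumptions.

```python
import re

# Matches dotted-quad candidates; the octet range is validated separately below.
IPV4_RE = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")

def unique_ips(lines):
    """Return unique IPv4 addresses in first-seen order."""
    seen = {}
    for line in lines:
        for match in IPV4_RE.finditer(line):
            # Reject things like 999.1.1.1 that only look like addresses.
            if all(int(octet) <= 255 for octet in match.groups()):
                seen.setdefault(match.group(0), None)
    return list(seen)

def every_nth(items, n=40):
    """Keep items 0, n, 2n, ... (a 1-in-n systematic sample)."""
    return items[::n]

# Hypothetical log lines, for illustration only.
sample_lines = [
    "Sep 27 21:00:01 fw DROP SRC=10.0.0.1 DST=192.168.1.5",
    "Sep 27 21:00:02 fw DROP SRC=10.0.0.1 DST=8.8.8.8",
    "garbage 999.1.1.1",
]
ips = unique_ips(sample_lines)
```

Note that a 1-in-40 sample of unique addresses says nothing about packet counts per address, so the counts in the graph presumably come from the full logs.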
That's what I thought it was for. Srsly, they're your firewall logs. You should have some clue where inbound traffic is coming from and why. If you've got a webserver serving some sort of information that changes, this could be RSS readers hitting your site. Or it could be pings of death being dropped by your firewall. It could be web surfers getting to work and hitting you up for information, or browsers grabbing some active information on your site. It could be Googlebots. It could be Slashdot hits for all I know. These are just theories, because this isn't my firewall or my traffic.
It's a little wrong to say a tomato is a vegetable. It's a lot wrong to say it's a suspension bridge.
Is this post an advertisement for Quova or Green Phosphor's Glasshouse?
Wait, is this just an advertisement for Glasshouse? The voice in the video on Green Phosphor's website is exactly the same.
What gives?
It's pretty interesting. You can see the countries with the largest botnets in the log... which also seems to suggest that a large majority of the packets are coming from the one botnet... since a good number of them kick in at the same time.
It also looks cool. Which is critical.
So there I was, scribbling down some notes off the PC screen by hand, when I reached for the keyboard and Ctrl-S'd.
Is this guy filtering out backscatter like DNS replication and time updates? If it's from a State agency it's entirely possible that they are running a root DNS server on-site (I work at a State agency and we are). Also, what timezone is he in? Knowing that might help explain the spike at 21:00. Is that GMT? Need input!
I judt got a nre Kinesis keybiartf so please excusr ant egregiou typos.
So you have access to these firewalls but you don't know how to go about diagnosing the problem aside from an Ask Slashdot? Am I the only one who's a little baffled by this?
It looks to me like the lines of major activity likely corresponded to major news events or other events that caused people to look at the relevant government agency. Without more data it is difficult to speculate. It might be possible to look at the approximate date (early September of 2009) and find a specific event that would cause this. Indeed, it might then be possible to actually make a guess as to which government agency the firewall belonged to.
It looks like an active attack probably from one source with a number of controlled bots helping out.
The packets from every country at once probably have spoofed sender IP addresses and come from one or more sources (probably the spike countries).
The spiked country traffic is probably from the controlled bots attacking the host actively.
Without seeing the actual packet data it's just a guess though.
Looking at the pop-up labels that show up when you mouse-over the data, there seems to be a huge temporal discontinuity in your data set: right at the first vertical stripe, the displayed date/time labels jump from 2009-09-17 to 2009-09-27. Maybe I'm just misreading the display, but a 10-day discontinuity would seem to account for the anomaly you describe.
It couldn't be that easy, could it?
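Checking for that kind of hole is cheap. A minimal sketch, assuming the log timestamps have already been parsed into datetime objects (the sample values below are made up to mirror the 09-17 to 09-27 jump; find_gaps is a hypothetical helper name):

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, max_gap=timedelta(hours=1)):
    """Return (previous, next) pairs where consecutive samples are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]

# Illustrative hourly buckets with a ten-day hole in the middle.
stamps = [
    datetime(2009, 9, 17, 20), datetime(2009, 9, 17, 21),
    datetime(2009, 9, 27, 21), datetime(2009, 9, 27, 22),
]
for before, after in find_gaps(stamps):
    print(f"gap of {after - before} between {before} and {after}")
```

If a gap like this shows up in the real data, the stripe right after it may just be two unrelated time ranges drawn next to each other.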
Yeah, I meant to say that it's also difficult to tell what's going on because you conflated all destination protocols and ports together.
Yes, he knows the firewall and the traffic. The question is - why is there traffic suddenly appearing from every country in the world at the same time? and again a number of hours later? And again 5 or 6 times? Suddenly there are inbound packets from every country in the world, for an hour or two, then it dies off. For some countries, the first 'stripe' is also the start of consistently higher traffic from that country. Does this mean anything?
I think it might be more useful to know the actual dates, and see if this corresponds with any spikes in spam or virus activity. What would be most useful would be to know the destination port number of the inbound traffic; that could give us much better clues as to the reasons behind the patterns.
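A rough sketch of that breakdown, assuming the logs look like netfilter/iptables LOG lines with PROTO= and DPT= fields. The agency's real firewall format is unknown, so the regex and the sample lines are assumptions:

```python
import re
from collections import Counter

# Assumes iptables-style "PROTO=TCP ... DPT=22" fields on each line.
DPT_RE = re.compile(r"\bPROTO=(\w+).*?\bDPT=(\d+)")

def port_histogram(lines):
    """Count log lines per (protocol, destination port)."""
    counts = Counter()
    for line in lines:
        match = DPT_RE.search(line)
        if match:  # lines without a port (e.g. ICMP) are skipped
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts

# Hypothetical log lines, for illustration only.
sample_lines = [
    "IN=eth0 SRC=1.2.3.4 DST=5.6.7.8 PROTO=TCP SPT=4000 DPT=22",
    "IN=eth0 SRC=1.2.3.9 DST=5.6.7.8 PROTO=TCP SPT=4001 DPT=22",
    "IN=eth0 SRC=9.9.9.9 DST=5.6.7.8 PROTO=UDP SPT=53 DPT=53",
    "IN=eth0 SRC=9.9.9.9 DST=5.6.7.8 PROTO=ICMP TYPE=8",
]
histogram = port_histogram(sample_lines)
```

A stripe dominated by a single port (22, 25, 53 and so on) would narrow the guesses a lot more than country of origin does.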
Ho! Haha! Guard! Turn! Parry! Dodge! Spin! Ha! Thrust!
it means that this is an ad for Quova and Green Phosphor's Glasshouse
Am I the only one who found the five minutes of this video to be about as interesting as listening to a stoned person describe the cracks on the ceiling?
You designed the visualization, buddy. If it's "freaking crazy looking," rather than yielding any useful insight, then obviously you did not visualize it in a meaningful way. You failed, in other words.
But as an earlier poster noted, this is just a Slashvertisement for the visualization tool in question. No doubt it will be quite effective on the kind of people who talk as slowly as the guy in the video.
Breakfast served all day!
And also... "oh my god... it's full of bars"
Fixed that for you.
To do something right, you often have to roll up your sleeves and get busy.
First, we would need to know what kind of traffic we are seeing. TCP/UDP? Web? DNS?
On the other hand, I think you have only partial logs, which would explain many of the blanks in your data. Some blanks are too geometric to be correct; you are probably missing a shitload of data.
You have to take that into account, and timezones. Timezones are the key to this. This is probably some public service that gets hit at regular intervals (root DNS server, webserver holding news/stock/climate or similar information, etc). Timezones would explain the pattern. We would need to check times for each country against a timezone table to see if they correlate.
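A minimal sketch of the timezone-table check, using a tiny hand-made table. The offsets are illustrative fixed values (no DST, one zone per label), and the labels and helper names are assumptions:

```python
# Hypothetical fixed UTC offsets in hours; illustrative values only,
# ignoring DST and countries that span several zones.
UTC_OFFSET_HOURS = {"US-East": -5, "Germany": 1, "Turkey": 2, "China": 8}

def local_hour(utc_hour, offset_hours):
    """Local wall-clock hour for a given UTC hour."""
    return (utc_hour + offset_hours) % 24

def utc_hour_of_local_peak(local_peak_hour, offset_hours):
    """UTC hour at which a habit tied to local time (e.g. 9am) would show up."""
    return (local_peak_hour - offset_hours) % 24

# Where would a 21:00 UTC stripe land on each country's local clock?
for country, offset in UTC_OFFSET_HOURS.items():
    print(country, "local hour of 21:00 UTC:", local_hour(21, offset))

# And where would a 9am-local habit land on the UTC axis?
for country, offset in UTC_OFFSET_HOURS.items():
    print(country, "9am local in UTC:", utc_hour_of_local_peak(9, offset))
```

If the traffic follows local daily routines, each country's peak should sit at a different UTC hour. A stripe at the same UTC hour for every country points to something other than timezones.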
I'm also pretty sure that if someone took the time to look at the most active countries, and the less active countries, and some groups in between, we would be able to probably determine what kind of traffic this was.
Some people mentioned botnets, and there's a good chance that they have a huge influence on these graphs. Again, matching timezones against the graph would help us understand.
I don't know what kind of information the submitter has on the logs, or how he got them, but if he could post at least a small sample, that would help a lot. /methinks that submitter has a lot to do with the tool he's using, and this is just another slashvertisement.
WTF am I doing replying to an AC at 5 A.M on a Friday night?
(this is a guess, obviously. Full netflow data would tell me more, but only way to be really sure would be a full packet trace)
This just shows that you're being scanned with random source IP addresses (that's why the vertical stripe lights up). It is essentially a check to see if part of the botnet has more firewall access than other parts, or if a load balancer directs stuff to different firewalls, or if you have additional BGP uplinks, some of which might not be quite as secure.
Then the real scan starts, which uses the information gained in the first phase to make sure it tests out all the firewalls the target network has. Especially in the case of backup bgp links, where traffic comes in on physically and administratively different lines (say 1 verizon, 1 at&t, if you've got money to burn, and most govt. idiots feel the need to burn money). If the company in addition to the multiple uplinks outsources firewalls to those ISPs (or "security", not knowing what they're buying and getting nothing more than a smug false sense of security), again this is done by too many govt. agencies, you are bound to find holes this way. This uses actual bandwidth, and cannot be done on some networks. So what you're seeing is a disproportionate amount of scanning traffic coming from countries with fast networks and few watchful netadmins (or netadmins that just don't care, in Turkey's case), and many unsecured computers (and dear God, Turks and Russians really do not see any need for virusscanners, but generally you'd see a few other countries in there too. Heh the Russians are probably worried that running a virusscanner will interfere with their development of new viruses)
The regular repeats of vertical lines are probably to rescan reachability information, in case something changed. BGP can be twitchy, especially with incompetent local admins (on the botnet side of the network I mean)
From the (low) speed of the attack you can further deduce that it was an advanced attack, meant to stay below rate limiters, and presumably meant to stay below the radar. And from the resources required to pull this off you can deduce that this was not a lone hacker. Perhaps an organization (these days, tracing source IPs for security attacks almost invariably yields an IP address in far inland China, which is not because the Russians have stopped attacking networks, but because the Chinese are putting quantity above quality, it seems, these days).
And frankly, if someone has this kind of patience, generally they will find at least something, even in a well maintained network. Best hope it was only some files left out in the "public" folder or ~username folders. It's a good bet they probed the network security in other ways too (esp. googling), with IP's that will tell you much more about where the attack is coming from (using many hops is possible, but results in very slow page loads. And we're all human)
Btw: looking up a net's country can be done quickly via DNS, no need for an external company, no need for any tax dollars:
[kimmy@t61 ~]$ host -t TXT 104.79.125.74.cc.iploc.org
104.79.125.74.cc.iploc.org descriptive text "US"
(don't forget to reverse the IP address: looking up 1.2.3.4 is done by host -t TXT 4.3.2.1.cc.iploc.org)
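The octet reversal is easy to get wrong by hand, so here is a small sketch that only builds the query name and does not perform the lookup. The cc.iploc.org zone is taken from the comment above and is not verified to still resolve; country_query_name is a hypothetical helper name.

```python
import ipaddress

def country_query_name(ip, zone="cc.iploc.org"):
    """Build the reversed-octet DNS name for an IPv4 address, e.g. 4.3.2.1.<zone>."""
    # IPv4Address raises ValueError on malformed input.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

print(country_query_name("1.2.3.4"))
```

The resulting name can then be passed to host -t TXT as in the example above.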
You're trying to imagine shapes in clouds; there is no context. Video conference call, maybe? Also, could be synchronization, or backups. Spooky garbage for the tin foil hat crowd, I hear there's a good business in it these days. It's an ad for a 3D graphing service.
The force that blew the Big Bang continues to accelerate.
"I happened to have access to five days worth of firewall logs from a US state government agency..."
"While skimming through my grandmother's cookbook, I stumbled upon a recipe for processing yellowcake uranium..."
"In passing, a close personal friend mentioned to me that he would deploy ~30k troops to a Mideastern country, but he's worried that the local restaurateurs won't serve fresh babaganoush ..."
"While I was talking to a famous adult film star about my successful experiment with cold fusion..."
"I was fighting against an alien invasion of the Soviet Union the other day. Natalie Portman and I prepared a platoon of sharks with frickin' hotgrits cannons on their heads, but the unwelcome overlords kept jumping the sharks..."
You want complaining? How about this: This visualization is terrible.
The video took five minutes to watch and most of it was him rolling over the bars in the 3-D chart so you can see what each of the lines means. If that's supposed to be a useful visual aid, I'll eat my hat. It's bad enough that you have to manually roll over every data element to figure out what it is; scrolling through the graph seemed dead slow. I hope that's not a limitation of the product itself.
Simple labels on the axes of the graph would have been nice. Far be it from anyone to try to stick little flags next to the lines to represent different countries. Hell, just color-coding them in a totally arbitrary way would have made the graph easier to read.
BTW, a quick look at the Glasshouse site reveals all their output looks pretty much just like this demo. And there's no evidence that you can export one of their rudimentary 3-D graphs to "pretty it up" in a real 3-D app. Instead, their raison d'être appears to be allowing you to run around looking at these graphs... in Second Life.
I'm sorry, but if you're doing something like plotting fractals, for example, where visual similarity to patterns is the whole point, I can forgive you for coming to the conclusion that "it's crazy looking." If what you're doing is trying to provide a visual to aid in the interpretation of data, then the visual should -- y'know -- aid interpretation. A glance at this graph, on the other hand, reveals nothing; not even what it's supposed to represent.
In summary, Edward Tufte will be rolling in his grave when he dies from looking at this graphic.
Breakfast served all day!
It's an ad for a 3D graphing service.
Indeed, the guy from the graphing service is the same guy who made this.
c++;
I see no reason whatever that it would be necessary to use either Quova or Green Phosphor. Any competent programmer could have sampled the data, used whois to get location, and then used about 1000 different programs to visualize the data just as well. (Like Crystal Reports or Seagate.)
The fact that OP did neither, and is involved at a high level with one of the two companies, makes this whole post suspicious.
My best guess is that OP thought he had discovered a way to freely advertise via Slashdot, and victimized us as a result.
I get enough Spam. I don't need to see even more, on Slashdot. Can this user be blocked?
Uh...a bot net?
That would explain most of it.
Also is he plotting this based on potentially spoofed IP addresses? I'm thinking not just a botnet, but a botnet that doesn't care if it's getting packets back or not. It may not be every country in the world, just a bunch of random IPs coming from zombies which may (or may not) be in far-flung places.
Mal-2
How is the Riemann zeta function like Trump rallies? Both have an endless number of trivial zeros.
It does seem like some kind of coordinated interest in the site, possibly a botnet, but it could also be due to press releases or other media publications, since it is a gov site. You would have to look over many days, not just hours, to come up with something conclusive. It is nonetheless interesting that every country, even those in different time zones, accessed the site at the same time, and it is odd that the Chinese are that interested in a US gov site at the same time, but I digress. Overall, more information over a longer time frame is needed to draw any real conclusions.
A loop, by its nature, continues. If that didn't make sense, start reading this sentence again.
Yes, he knows the firewall and the traffic. The question is - why is there traffic suddenly appearing from every country in the world at the same time? and again a number of hours later? And again 5 or 6 times?
I get a lot of distributed dictionary attacks like that. It's pretty normal.
http://michaelsmith.id.au
Nice visualization. Wonder if there is some way to do it in real time.
I've done networking and security for a university for the last 10 years. I can guess what this kind of activity would be if it was at my institution. Basically, there are several reasons why every country in the world will suddenly talk to us. They include P2P/Gnutella's, P2P/Swarmcasting, Bittorrent, Skype, P2P-poisoning, P2P-misdirection, and hacker/bot activity.
When we have pulses like you are observing, it is usually BitTorrent.
The Gnutella P2P variants don't usually have that many peers. And, they tend to last for several hours or days.
The various Swarmcasting P2P variants look very similar to BitTorrent, but again, the users tend to leave them running for hours or days.
A popular Torrent makes connections to hundreds of locations at once, and usually the local user shuts down in minutes (or an hour) when they get their file.
Skype won't be narrow bands. It will be every country in the world talking to you all the time. We have had computers promote themselves up the Skype infrastructure until they are constantly talking to over 600K peers. Of course, it is more normal to see a Skype node talking to 10K to 20K peers, but still Skype won't be bands. Skype raises the floor for the entire graph.
P2P-poisoning would closely match your bands. For several years we observed pulses where every member of a large P2P cloud would attempt to talk to a non-existing IP at our institution. Eventually, we realized that somebody was attempting to render the P2P cloud non-functional by poisoning the P2P community with info on non-existing peers. Of course, since this is a Denial of Service (DoS) attack, this is technically illegal, but we saw it happening for years. But, it appeared to stop a couple years ago (about the time Obama replaced Bush) and we haven't seen any evidence of it lately.
P2P-misdirection is where a cloud will attempt to confuse traffic analysis by throwing out random connections/packets to random IPs. Typically, this misdirection happens all the time, and not in bursts/bands.
Bot attack activity doesn't match your patterns either. We observe several types. None would look like your bands:
- The spoofed attacks will look like every one of your IPs getting acks from a few remote IPs.
- The mapping activity will look like a representative sample of your IPs getting traffic from a few dozen IPs.
- An incoming DoS would have a few of your IPs get (spoofed) traffic from everywhere, but it would be sustained.
- Portscans will only involve a handful of remote IPs.
- The Tag-team SSH password guessing is close. During the last week, we observed about 3000 sources located all over. But, it happens all the time (in the aggregate), not in bursts. And the sources this week are concentrated in Italy, Poland, Eastern Europe, Colombia, and Brazil. They aren't really all over the world.
So, I'm guessing it is BitTorrent. But, your situation may be way different from mine.
Miles
Vertical stripes may be from spoofed addresses -- nothing from real sources, even botnets, can be that uniform across the whole address space. It would make sense to check how much of traffic comes from unallocated address space, as packets from there are guaranteed to be spoofed. Why would anyone do such a thing? As a direct portscan it would be useless (he can't see the responses), however it might be used as a smokescreen to hide a real portscan or attack from some of those addresses. It may even be an attack that floods the DNS servers with fake responses in the attempt to poison DNS cache, thus redirecting some of the traffic to the attackers' addresses.
Then, after whatever kind of discovery was completed, you have seen some targeted host scans, [D]DoS attempts or actual exploits causing large amount of traffic (horizontal stripes).
Another possibility is that those packets are responses caused by something on your network being coerced into sending packets uniformly to the whole address space. It may be something as stupid as a web page with random redirects, however more likely it is a worm on some of your computers looking for other members of his botnet. After such discovery some hosts joined the botnet[s], producing horizontal stripes composed of traffic from other botnet members.
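A rough sketch of the "how much comes from impossible addresses" check, using only Python's ipaddress module. Loud caveat: this flags special-purpose ranges (private, loopback, reserved and so on), not registry space that was simply unallocated in 2009. Catching that would need a dated bogon list, which is not included here. The function names are hypothetical.

```python
import ipaddress

def is_special_purpose(ip):
    """True if the address can never be a legitimate public Internet source."""
    addr = ipaddress.ip_address(ip)
    return (addr.is_private or addr.is_loopback or addr.is_link_local
            or addr.is_multicast or addr.is_reserved or addr.is_unspecified)

def special_purpose_fraction(ips):
    """Fraction of source addresses that fall in special-purpose ranges."""
    ips = list(ips)
    if not ips:
        return 0.0
    return sum(is_special_purpose(ip) for ip in ips) / len(ips)

# Illustrative source addresses, not taken from the real logs.
sources = ["10.1.2.3", "8.8.8.8", "240.0.0.1", "1.1.1.1"]
print(special_purpose_fraction(sources))
```

A noticeably nonzero fraction inside the vertical stripes, and close to zero elsewhere, would support the spoofing theory. A zero result proves nothing, since a careful spoofer can stick to allocated ranges.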
Contrary to the popular belief, there indeed is no God.
Computers are used by people. People who wake up, work, play, sleep, have weekends, business holidays, religious holidays, events and a pantheon of other reasons why they might act in seeming semi-concert.
You're suggesting that for the five day period in question, the majority of people woke up at the same time GMT? Not 7am local time, but 9pm GMT everywhere in the world? Or did you just not actually look at the video (which shows spikes of data from every country in the world at the same time)? "Timezone effects" should eliminate these sorts of lines, not cause them, by spreading that kind of activity out over 24 hours.
"Convictions are more dangerous enemies of truth than lies."
If we assume the video conference included people from all of those countries, who all endeavored to join at the same time GMT regardless of local time, and they keep conferencing for several days without sleeping, then yes, that would account for those horizontal lines that suddenly get thick at the first vertical stripe and continue until the end of the five-day period. That definitely makes sense... ~
"Convictions are more dangerous enemies of truth than lies."
It's elementary, my dear Watson. P2P. Someone's firing up BitTorrent (hence, every country in the world with long streams to those actually grabbing data).
This is the guy whose product we're talking about. He wants to explain himself. If you think he tried to use Slashdot to advertise his product, you don't have to mod him up, but if you mod him down to -1 then he'll drop below a lot of people's thresholds and they won't even see that he tried to participate. That's not being fair.
Breakfast served all day!
Bingo. My thoughts exactly.
Unless he gives up some more data, it's hard to tell for sure.
But, I agree, it sounds like someone is using their employer's (government) bandwidth to torrent. Could be a machine that someone shuts off the monitor on but that downloads overnight with a scheduled P2P app.
The peaks/valleys might be explained by reset packets introduced by the ISP temporarily killing the outbound requests and it takes the inbound requests awhile to trickle off.
You can see this same type of log traffic by simply starting a torrent, waiting a little bit, then stopping the P2P client, waiting awhile again, then restarting it. Rinse, repeat and you will see something that looks awfully close to what you have.
Reset packets essentially create the same traffic pattern, but for a different reason (ISP- introduced traffic "shaping").
I would wager that if he were to look at outbound traffic at the same time as the inbound "stripes" he would indeed find a correlation. For example, if you ping some IP address it should send you back a packet of data. Perhaps those stripes aren't so representative of everyone else all of a sudden looking at the site but of the site looking at everyone else and getting some kind of answer back?
I'm no sys-admin, but it's a logical hypothesis.
Tamran
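The inbound/outbound hypothesis above can be tested with a plain correlation over hourly packet counts. A minimal sketch, assuming both directions have already been bucketed per hour (the numbers below are made up, and pearson is written out by hand to avoid depending on a particular Python version):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Illustrative hourly packet counts, not real data.
inbound = [120, 130, 9000, 8700, 140, 125]
outbound = [80, 85, 7000, 6900, 90, 82]
print(round(pearson(inbound, outbound), 3))
```

A value near 1 during the stripes would suggest the inbound packets are answers to something leaving the network. Shifting one series by an hour either way would show which direction leads.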
Ahah. So you are why my costs for bandwidth are so high.
Not sure what it means, but I'm tempted to plug-in Guitar Hero and jam along to your firewall logs.
Just let me finish my Klax game first.
Piss-poor ad though. How many people saw the video and thought "I must get me some of this graphing tool!"? My first thought was "interesting way of presenting information, but his graphing tool is crap".
Ho! Haha! Guard! Turn! Parry! Dodge! Spin! Ha! Thrust!
The graph is kind of misleading: it's not actually to scale and it's not showing the 5 days he claims in the YouTube description. Go to around the 3:05 mark and watch the time stamp when he mouses over Romania. On the far left you can see an early date of 2009-09-15; as he scrolls to the right we can see a date of 2009-09-28 at the second stripe, which is roughly in the middle of the graph, and the far right-hand portion of the graph is dated 2009-09-30. The left-hand side of the graph shows results over a span of 13 days, and the right-hand side, taking up the same visual space, only shows 2-3 days. Basically I just wasted 15 minutes looking over worthless data on a random YouTube video that doesn't actually say anything.
A botnet attack? But then the activity shouldn't be concentrated by country, but spread around the world about evenly.
Or it could be that someone's seeding a torrent from behind the firewall. That would explain the suddenly starting continuous activity. It might also explain the concentration by country (language or timezone). It would help if the graph could be organized by such factors.
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
My first thought was "why does everybody have to make everything a video?"
What I find a bit odd is that nobody has even thought to question what business the submitter has with 5 days' worth of server logs from a US state government agency.