Redundant Internet Access?

Not there yet by perlchild · 2004-07-12 15:58 · Score: 5, Informative

I haven't switched on our redundancy just yet, but I can assure you of one thing: when I do, two different companies will be providing the circuits.

Having them in two COs, redundant everything, yet linked to the same AS (when it isn't mine) makes me nervous.

Re:Not there yet by Asgard · 2004-07-12 16:01 · Score: 4, Interesting

Beware: I recall a story about redundant lines leased from two different companies, only for the customer to find that both companies leased their lines from the same carrier and that it was all contained in the same conduit.
Re:Not there yet by perlchild · 2004-07-12 16:35 · Score: 1

I'm aware of that story too, and the company we currently use supplies local loops to practically everyone around here. I intend to address that once I get the second circuit (probably by ordering the local loop separately).
Re:Not there yet by Carnildo · 2004-07-13 06:49 · Score: 1

Mine really is redundant:

Connection #1 is a cable modem with the cabling running in a conduit on the outside of the apartment building. Connection #2 is a regular modem with the wiring running through the building interior. Two separate ISPs, with separate POPs. The only common failure point is the mainboard of the computer functioning as a router -- absolutely everything else is redundant.

Admittedly, the switchover time isn't impressive, and the backup connection is slow, but I've never been without internet service.

--
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.

Actual conversation by BrynM · 2004-07-12 16:01 · Score: 4, Funny

This is what I overheard when the place I worked at years ago was shopping disaster recovery sites. Mind you, this was for a mainframe - this place was supposed to be fully redundant in about 20 other ways as well.

Boss: We need redundant connectivity and power.
Sales-Goof: You can have as many people open browsers on as many computers as you want.

For comparison and not a plug, when my boss asked the IBM guy, he pulled out charts and wiring diagrams to explain what they had.

--
US Democracy:The best person for the job (among These pre-selected choices...)

Re:Actual conversation by PaulBu · 2004-07-12 16:13 · Score: 1

... I overheard when the place I worked at years ago ...

Did they have BROWSERS back then???

Do not take it seriously, just my attempt at a light-hearted joke! ;-)

Paul B.
Re:Actual conversation by BrynM · 2004-07-12 16:40 · Score: 1

We had a Godzilla on top of the mainframe and a poster of dinosaurs next to it. We were running a mainframe in the late 90s, so we got teased a lot ;)

--
US Democracy:The best person for the job (among These pre-selected choices...)

On a Wing and a Prayer by orthogonal · 2004-07-12 16:10 · Score: 5, Funny

To those of you on Slashdot who know what I'm talking about: are your circuits truly redundant? What have your experiences in network redundancy been?

I have two homing pigeons.

If Cupid smiles on them, soon I'll have even more redundancy.

--
Opinions on the Twiddler2 hand-held keyboard?

Very concerned by invisik · 2004-07-12 16:12 · Score: 4, Interesting

I worked at a place that was running redundant T1's just as you describe. They might as well have had all the wires running together the whole way.

My issues from there:

1. How do you convince an ISP to bring a feed in from another CO? Distance is a huge problem--they don't want to run it.

2. How do you know what the ISP has on their end, UPS's, generators, etc? Should that be part of the SLA? Or should you demand a tour of their facilities to see where your wire goes?

3. How can you coordinate two separate ISPs for automatic redundancy? I suppose with a LinkProof box or something. And how do you know they aren't coming through the same telco CO?

4. Should you pay to have them manage the lines and router configurations in a 24/7 scenario? Or does it work well enough to have them do the initial install and then let it run?

5. Finally, what's a reasonable cost for this redundancy?

I have some more projects that will require this type of setup. I'm interested to hear any opinions and recommendations from the experience of fellow slashdotters...

Thanks much!

-m

--
http://www.invisik.com

Re:Very concerned by duffbeer703 · 2004-07-12 16:28 · Score: 4, Insightful

The local telco will lie their asses off and charge you insanely expensive rates for mediocre service.

Unless you're in a downtown area or a tech park, forget about redundancy.

IMHO, anything facing the public that needs redundancy belongs in a colo.

--
Conformity is the jailer of freedom and enemy of growth. -JFK
Re:Very concerned by redog · 2004-07-12 22:32 · Score: 1

1. How do you convince an ISP to bring a feed in from another CO? Distance is a huge problem--they don't want to run it.
If you can, pressure them: "If you can't do it, I can't pay you for it, and I will find someone else who can." (Don't let them know, even if you can't.) Or let them know you have found an off-site solution that meets your needs, and that you will no longer need their lines unless they can provide what you're paying for.
2. How do you know what the ISP has on their end, UPS's, generators, etc? Should that be part of the SLA? Or should you demand a tour of their facilities to see where your wire goes?
I would demand that it be a part of the SLA, so that when you do go completely down your company can cease payments for the downtime and possibly collect losses.
3. How can you coordinate two separate ISPs for automatic redundancy? I suppose with a LinkProof box or something. And how do you know they aren't coming through the same telco CO?
When you find a way, let me know.
Re:Very concerned by green+pizza · 2004-07-12 22:54 · Score: 1

3. How can you coordinate two separate ISPs for automatic redundancy? I suppose with a LinkProof box or something. And how do you know they aren't coming through the same telco CO?

Google for BGP.

You need to get an IP address block that both ISPs are willing to advertise/route for you. This is not a problem if you deal directly with Sprintlink, UUNet/MCI, AT&T, or another Tier 1 provider. Any modern mid-range Cisco (or Juniper) router can handle multiple connections via BGP. The main limitation is RAM: BGP tables take up about 128 MB, so you probably can't (comfortably) use a low-end bargain 2611 or 4500M for your dual T1s.

You'll run into troubles if you use consumer router hardware or try to use Tier 2 ISPs (Joe's Friendly Internet of Miami).

2 to the same provider is not redundant by ZESTA · 2004-07-12 16:15 · Score: 5, Informative

Having "redundant" circuits to the same provider is pretty useless. You really need to be connected to two completely separate upstream providers for decent redundancy. If you have mission-critical needs, you want 3.

-Randy

Re:2 to the same provider is not redundant by slashjames · 2004-07-13 07:42 · Score: 1

I will second this. Use separate upstream providers if you want it to be truly redundant. Here, I have DSL and Cable for the upstream. They both go (on my end) to a Cisco 1750 (ADSL + ethernet blades). It links to a PIX 515e and that is what the rest of the network sees: just the PIX 515e.

The redundancy is handled by the 1750. We have 5 static IPs with the DSL and 1 static with Cable. Since the DSL is the primary connection, we have the routes for it listed before the route for the Cable. The moment the 1750 figures out it can't transmit along the routes for DSL, it starts using the Cable. When DSL service is restored, it starts using the DSL routes again automagically.
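The ordered-route behaviour described above can be sketched in a few lines. This is only an illustration of the idea (prefer the first route that is still usable, fall back to the next, return to the first when it recovers), not the 1750's actual configuration; the `Uplink` class and the priority numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Uplink:
    """Hypothetical model of one upstream connection."""
    name: str
    priority: int      # lower number = listed earlier = preferred
    up: bool = True


def pick_uplink(uplinks):
    """Return the most preferred uplink that is currently up, or None."""
    live = [u for u in uplinks if u.up]
    return min(live, key=lambda u: u.priority) if live else None


dsl = Uplink("dsl", priority=1)      # primary, listed first
cable = Uplink("cable", priority=2)  # backup, listed second
links = [dsl, cable]

dsl.up = False                 # DSL stops passing traffic
backup = pick_uplink(links)    # falls through to cable
dsl.up = True                  # DSL service is restored
restored = pick_uplink(links)  # back on DSL again
```

The selection is re-evaluated every time, which is why the switch back to DSL needs no separate logic.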

Windows by Anonymous Coward · 2004-07-12 16:29 · Score: 2, Funny

"Have you found yourself paying for something that you weren't really getting?"

Aw! You're making this too easy.

Re:On a Wing and a Prayer by Anonymous Coward · 2004-07-12 16:29 · Score: 2, Funny

Great, now I will mod you up +1 Redundant.

Now where is it? Aha...there...

One Example by Marillion · 2004-07-12 16:37 · Score: 2, Interesting

I know that one major US air carrier has a "six-foot" standard for the parallel OC3 pipes between its key airports. This is to protect against "Johnnie Backhoe".

Part of the expense was justified by cost savings using VOIP between the stations and the operations centers.

--
This is a boring sig

Re:One Example by pyite · 2004-07-13 01:32 · Score: 1

Chances are that six feet is not going to help you against a backhoe. A 3 cubic yard bucket is pretty big and six feet is just too close for comfort.

--
"Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

Another completely different approach by DDumitru · 2004-07-12 16:42 · Score: 4, Insightful

My personal opinion is that trying to reach this level of redundancy for a lot of companies is just not practical and that there are much better approaches.

The idea here is to think of your internet connectivity as two different classes of service. You should place your internet-reachable servers in a good co-lo. Get BGP lines from two different sources and multi-home the boxes. Don't run your own AS (use the upstream's space) but instead place your servers "close" to your provider's edge routers. In the end, you are BGPing the loop, and it is hard for 100 ft of cat-5 to fail. Ask yourself: "Am I more qualified to keep my BGP up than Level-3 (or Savvis ... or AT&T ... or MCI ... or Sprint ... or Cogent) is?"

In terms of your office, stick to client-only type services. Get two "diverse" connections. This might be a T-1 and a DSL, or a DSL and a cable modem. By using completely different architectures, you can get incredible diversity without spending a bunch of money. You can then IPSEC your local net over the client-only connection back to your addresses in the co-lo and, with the help of a little client-side monitoring, auto-switch when a line goes down.
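A rough sketch of what the client-side monitoring could look like follows. It assumes some external probe (a ping or a test connection through the tunnel) reports whether the primary line is passing traffic; the `LinkMonitor` class and the threshold of three failed probes are hypothetical choices for the example, not a description of any particular product.

```python
class LinkMonitor:
    """Switch to the backup line only after several consecutive failed probes."""

    def __init__(self, fail_threshold=3):
        self.fail_threshold = fail_threshold  # assumed value; tune to taste
        self.failures = 0
        self.active = "primary"

    def record_probe(self, primary_ok):
        """Feed in one probe result and return the line that should be in use."""
        if primary_ok:
            # Primary answers again: reset the counter and switch back.
            self.failures = 0
            self.active = "primary"
        else:
            self.failures += 1
            if self.failures >= self.fail_threshold:
                self.active = "backup"
        return self.active


monitor = LinkMonitor(fail_threshold=3)
history = [monitor.record_probe(ok) for ok in (True, False, False, False, True)]
```

Requiring several misses in a row keeps one dropped probe from bouncing the tunnel between lines.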

We offer something similar as a part of our hosting offering for users with green-screen (telnet, serial terminal) applications. A client gateway application manages logical "connections" back to our multi-homed central servers, walking around BGP router "flaps" and other transient outages that BGP does not even address.

Re:Another completely different approach by Anonymous Coward · 2004-07-12 16:50 · Score: 0

nice plug
Re:Another completely different approach by DDumitru · 2004-07-12 16:57 · Score: 2, Interesting

Only for those few souls that are still running green-screen. And I really was not trying to advertise (at least not too much) but was trying to get people to think about what is possible.

If anything, my "real" motive is to keep people from putting servers in-house. If your office has the same "pipe", "power", and "security" as a good co-lo, then you spent too much money building it.

After all, there are millions of square feet of unused co-lo at rock-bottom prices just begging for more space-heaters (er, servers) to keep the resident space-heaters (er, servers) company.
Re:Another completely different approach by mwvdlee · 2004-07-12 19:22 · Score: 2, Informative

A lot of companies (e.g. financial institutions) simply cannot CoLo their servers due to regulation.
This is not just due to stability/reliability concerns but mostly security; how would you feel if your banking account was housed on www.cheap-ass-pr0n-servers.biz or something like that?
Don't be fooled; any techy at a CoLo can look at your data if (s)he wants to.

--
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
Re:Another completely different approach by DDumitru · 2004-07-13 03:34 · Score: 2, Informative

You are wrong.

Many banks do run in Co-los. We have neighbors in the co-los we are in that are banks, insurance companies, medical, etc. And I would feel very comfortable with my bank locating in some of the co-los that we are in.

Case in point are co-los with "real" security. Savvis (formerly Exodus) in Los Angeles (actually El Segundo for those that care) has armed guards, card key access, hand scanners, more security cameras than you can count, and man traps. If you need more, you can get private cages, rooms, and even bomb-proof vaults. And if you know of a "regulation" that states that your internet-connected servers cannot be located in a secured facility with bonded armed guards, I would like to read that reg. Personally, I think your statement is just a continuation of the "myth" that co-los are somehow less secure. While no place is 100% perfect, co-los are much "more secure" than the back room at your office.

Remember, that when I say co-lo, I don't mean "cheap-ass-servers.com" (someone quick, go register this as it appears to be available). There are many high-quality co-los from companies like Savvis, Switch and Data, and others that are run extremely professionally. If you are lucky enough to be in a good co-lo, you will get better up-time than you could possibly hope for in-house.
Re:Another completely different approach by silicon+not+in+the+v · 2004-07-13 03:44 · Score: 2, Interesting

Get two "diverse" connections. This might be a T-1 and a DSL, or a DSL and a cable modem. By using completely different architectures, you can get incredible diversity without spending a bunch of money.
There's a friend of mine who developed a database system for some schools to run through their website. He is also providing hosting for them at his house (after pricing out several hosting services and finding them a waste). The school system bought the server, and he got redundancy by hooking up DSL as primary with a cable modem as secondary. The server has both connections plugged in, and if it detects that the DSL has gone out, it will fail over to the cable one.

That's a pretty good solution for low cost. I don't know the details of how he set up the server to do the detection and failover, but he has confirmed that it works by disconnecting the DSL line from it and seeing it automatically switch the connection and keep serving on the cable line. It's a Windows Server 2003.

--
We may experience some slight turbulence and then...explode. -Capt. Mal Reynolds
Re:Another completely different approach by Big+Jason · 2004-07-13 14:45 · Score: 1

Aren't man traps illegal? Maybe it's lethal man traps that are illegal...
Re:Another completely different approach by DDumitru · 2004-07-13 18:02 · Score: 1

I have walked through it many times. It is basically a hallway with opposing doors that cannot both open simultaneously. Guards watch the interior, making it difficult to wheel out lots of gear without being noticed. Now, if there were machine guns, they were well hidden ;)
Re:Another completely different approach by gujo-odori · 2004-07-13 19:54 · Score: 3, Informative

Yeah, what he said.

I used to be a network engineer at a large co-lo company which was acquired by Cable and Wireless after going through Chapter 11.

The data center in which I worked had a different take on man traps. They looked very much like a Star Trek transporter, and like the transporters, were temperamental and at least one of them was frequently out of order entirely. This was bad because they were made by an Italian company and every time one of them broke, a service tech would have to fly out *from Italy* to fix the piece of crap. One of them was once down for almost two weeks because they didn't have the part it needed. It was quite common for people to get stuck in them and have to be let out by the guards. It happened to me several times.

They worked by having a convex sliding door on each side of a floor-to-ceiling Plexiglas cylinder. Inside is a card reader for your badge and a biometric reader that you put your hand on. If both match up (and the fscking thing doesn't break!) the door on the other side opens. Both doors cannot be open at once, so you have to wait for the first door to close before it even lets you wave your badge at it and have your palm read.

You couldn't steal anything much via one of those, since even getting a 2U server through the tiny things required holding it between your legs. Otherwise it would think two people were in there at once and refuse to open the other door. Anytime it doesn't open, whether by design or all-too-common breakdown, it sets off an alarm at the guard station and the guards have to come and let you out.

If the things are broken, the guards can open an alarmed door (also used to take large piles of gear into the data center and go up the freight elevator), and no one could steal anything that way either because they can see anything you've got. You have to fill out paperwork on any equipment you are bringing in or taking out, with description and serial number. There's an audit trail on anything anyone does - even employees - in the colo space.

After you get out of the transporter, if your atoms haven't been scattered halfway across the universe, you go to a secure elevator which again requires your badge to operate. It will go only to the floor your badge is authorized for, and after you get out of the elevator you have to use your badge to also open a door.

Then, you finally get to your cage, which you can open with the key you signed out from the guard station when you checked in. At the guard station, you need to be on the authorized list for your company, and you need photo ID or you don't get in.

It was pretty safe and pretty secure.

I have also been inside one of the data centers of a large, well-known investment bank. It was far less secure than the colo data center where I worked. For starters, all I had to do to get in was be in the company of a senior sysadmin who worked there and had 24 x 7 building access. He signed me in at the door to the building (which was not a dedicated data center; it was a regional headquarters in which the data center was housed), and I didn't even have to show ID. And it was on a *weekend* when the place was deserted. There were no security checks after that. None. He just swiped his badge at the computer room door and took me in for a tour. I never should have been allowed in there at all, let alone with no one even checking to see who I was or anything.

Granted, this regional headquarters was not in the United States, nor is it a US bank, so they may have different regulations, but I'd be surprised indeed if there were any regulations stating that a bank cannot use a colo facility. I also used to work for an American bank, in its main data center. The security was a lot better than what I just described above, but not as good (not even close!) as the security at the colo data center where I worked. The network connections to that place also went by not one, not two, but *five* different carriers. It wou
Re:Another completely different approach by Anonymous Coward · 2004-07-15 11:47 · Score: 0

So how do you maintain the same IP when you're using a connection from 2 different providers? Not to mention most cable (and a lot of DSL) going to homes uses DHCP.
Re:Another completely different approach by mwvdlee · 2004-07-20 22:57 · Score: 1

It has little to do with actual security inside co-lo's, it has to do with regulation.

In the Netherlands, where I work, financial institutions are simply not allowed to do so by regulation.

Even so, I would be very sceptical of a colo which claims such security. They may very well have such levels, but what is the guarantee, who monitors them, and whose responsibility is it if something goes wrong? More importantly, will these co-los indemnify the company financially?

Most likely the systems in the colo were just some front-end (web) servers. The problem is that, for a bank, the risk of something happening may be incredibly small, but the damage if something were to happen would be incredibly high. Unscheduled downtime of just a few minutes per year literally costs millions of US$, and having an important server offline (including backup) for a whole day is likely to bankrupt any bank.

--
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?

Redundant should be 100% by binaryspiral · 2004-07-12 16:46 · Score: 1

Redundant should be designed as 100% redundant - count on your ISP or local telco getting bombed off the face of the planet - then plan around that.

If you don't have a CLEC or ISP - then turn to DSS sat.

For those of you who have a T1 and want a cheap backup - think about ISDN, DSL, or even a Cable internet account - it doesn't have to equal a T1 but would do in a pinch for routing mail and basic traffic.

If your boss doesn't think your company needs a redundant line - go unplug the csu/dsu for an hour and then ask again.

Re:Redundant should be 100% by Passman · 2004-07-12 17:14 · Score: 2, Interesting

If you don't have a CLEC or ISP - then turn to DSS sat.

Actually you may not have to go that far. For redundancy around here we can go to the power company. Our local power company has a networking affiliate with their own completely separate network.

Good luck trying to get the local phone company to admit that such a situation exists in your area, though.

--
Minne-snow-da: Winter is coming...
Re:Redundant should be 100% by mebob · 2004-07-13 03:09 · Score: 1

I can see ISDN or DSL for P2P T1 redundancy or purely for access TO the internet. Are there any DSL or cable internet solutions for redundancy? I'm guessing that some kind of co-lo DSL solution is the only hack that might work. This of course would not be for truly big-time hosting.

--
=1000101
Re:Redundant should be 100% by binaryspiral · 2004-07-13 03:53 · Score: 1

No, not for a major office or site. But for branch offices or smaller setups a DSL line would be a decent backup for a T1. Or even DSL and Cable as a backup. This is only considering the cost, though.

Having redundant T1's from different providers is always going to be the best option, but may be more expensive than many are willing to pay for.

Diverse routing by psyconaut · 2004-07-12 16:52 · Score: 3, Informative

When ordering DS1s from a telco, you generally have to specify diverse routing to get them nailed to a different CO!

-psy

single point of failure by Anonymous Coward · 2004-07-12 17:06 · Score: 2, Informative

Even though you have redundant circuits, that doesn't mean you have redundant internet access. BGP can still fail. For example, a few weeks ago my company suffered about 8 hours of downtime because of an MCI fuckup and a Russian company advertising routes for about 100 of ATT's customers. Our systems were up the whole time, but a good deal of the internet was trying to route traffic for those networks through Russia.

MCI/WorldCom says it happened because a fiber was cut in Ohio, which exposed a weak point. Eventually, after much escalation, MCI added ACLs to block those routes from being advertised across their network.

During the downtime, depending on where you connected from, you had up to 99% packet loss. ATT claims fully redundant OC-somethings coming into the data center (we colo with their ATT ENS branch), yet all it took was one little fucker on the net advertising routes to screw us (and others) over.

Not being a network guru, I am not certain how the average company could defend itself against such a problem. It seems many router peering setups operate on the grounds that the peers trust each other to do the right thing.
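The ACL fix mentioned above amounts to prefix filtering: a provider accepts a route announcement from a peer only if it falls inside address space that peer is known to hold. Here is a minimal sketch of that check using Python's standard `ipaddress` module. The prefixes are documentation ranges and the `build_filter` helper is made up for the example; it is not how any particular router implements its filters.

```python
import ipaddress


def build_filter(allowed_prefixes):
    """Return a function that accepts only announcements inside allowed space."""
    allowed = [ipaddress.ip_network(p) for p in allowed_prefixes]

    def accept(announced_prefix):
        candidate = ipaddress.ip_network(announced_prefix)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        return any(
            candidate.version == net.version and candidate.subnet_of(net)
            for net in allowed
        )

    return accept


# Hypothetical peer that is only entitled to announce 192.0.2.0/24.
accept_from_peer = build_filter(["192.0.2.0/24"])

legit = accept_from_peer("192.0.2.0/25")       # inside the peer's own space
hijack = accept_from_peer("198.51.100.0/24")   # someone else's space
```

A customer cannot install such a filter on other people's routers, which is why the fix had to come from the carrier.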

Of course ATT doesn't have to say anything to the public. I think the government regulations say that telecom companies must go public with info if an outage impacts more than 15,000 of their customers (I forgot where I read that). In this case they claimed it affected fewer than 100: perhaps only those located at the same data center as my company's stuff, a data center which appears to be more than 60% vacant.

wait by austad · 2004-07-12 17:10 · Score: 3, Informative

So, just a second...

Both your T-1's go to the same ISP? Why are you running BGP then? You aren't gaining anything from this except for added complexity. If you're going to continue with this setup, drop the BGP and bond the T-1's together.

The only reason you would want to run BGP is if you had separate links to different ISPs. This is the best way to do it when going for added redundancy. Then if one ISP has a problem, your routes only get propagated out the other link. Keep in mind that you will probably have to play around with as-path prepending and some other things to balance your traffic properly when you do this. And keep in mind that if your total bandwidth exceeds that of one T-1, when one of them does fail, you are going to saturate the other one. If you make sure you get enough bandwidth to prevent this from happening, you won't need to play around with balancing the traffic so much either.
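The saturation point is simple arithmetic and easy to check ahead of time. The sketch below assumes the standard T1 line rate of 1.544 Mbit/s and asks whether the peak load still fits after the largest link is lost; the function name and the sample loads are made up for illustration.

```python
T1_MBPS = 1.544  # nominal T1 line rate in Mbit/s


def survives_single_failure(capacities_mbps, peak_load_mbps):
    """True if the peak load still fits after losing the largest single link."""
    if len(capacities_mbps) < 2:
        return False  # one link has nothing to fail over to
    worst_case = sum(capacities_mbps) - max(capacities_mbps)
    return peak_load_mbps <= worst_case


two_t1s = [T1_MBPS, T1_MBPS]
light = survives_single_failure(two_t1s, peak_load_mbps=1.2)  # fits on one T1
heavy = survives_single_failure(two_t1s, peak_load_mbps=2.0)  # would saturate
```

With two T1s carrying 2.0 Mbit/s at peak, the second link adds capacity but not redundancy.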

There are a couple of companies out there that make BGP load balancing devices that will look at the load on each of your links, and make modifications accordingly. I've never used one, and have no idea how well they work. F5 I think makes one, and there was another I looked at awhile back that cost $8k or so, but I forgot the name of it now.

But, bottom line, BGP over 2 links to the same ISP is pointless unless you have a separate path to another ISP somewhere.

--
Need Free Juniper/NetScreen Support? JuniperForum

Re:wait by ebrandsberg · 2004-07-12 18:33 · Score: 1

If they have their own IP space, it would make sense. In addition, any configuration with more than one link to their provider will make use of BGP to allow the network routes to adjust properly if one went down. Most providers prefer this due to the simplicity in setting up filters for the advertised routes into their networks. I would honestly question a dual T1 setup that didn't make use of BGP, as in most cases, there is a single point of failure if it isn't used.
Re:wait by austad · 2004-07-13 03:26 · Score: 1

If they have their own IP space, then it makes a little more sense, but the provider could still advertise that for them, and the way this post was written implied that they had just gotten it for redundancy.

But I still do not see where you're coming from. I disagree. If you have 2 T-1's to a single provider, and the same IP space is being serviced across both T-1's, tell me exactly how BGP benefits you. Ideally you would bond these T-1's anyway to get a fatter pipe, so they would act as one link. But even if you didn't, you can load balance across them anyway. The only routes really involved in this setup are a couple of static routes, and most ISP's actually prefer to set it up this way.

I've been doing this stuff for over 6 years, and I don't see any benefit to what he's doing.

--
Need Free Juniper/NetScreen Support? JuniperForum
Re:wait by ebrandsberg · 2004-07-13 10:07 · Score: 1

Let's rule out bonding, as they are talking about going to different locations on the network. Bonding works well on two links between two routers, not for redundancy at the level being discussed. Let's also rule out static routes, as a static route will remain in place as long as the ISP side of the connection thinks the link is up. If you have a connection but for some reason can't pass traffic (it happens) then a static route won't work. BGP allows the routes to be injected in at two different points in the network, AND it can detect if the link is up but not passing traffic properly. It is as a general rule the easiest protocol to setup that guarantees to a much higher level of confidence that traffic will flow properly.
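The difference from a static route comes from BGP's keepalive and hold-timer mechanism: each side expects to hear from its peer periodically, and if nothing arrives within the hold time the session is torn down and its routes are withdrawn, even though the physical link still reports "up". A toy model of that timer is sketched below. The `Session` class is hypothetical, timestamps are passed in explicitly so the example is deterministic, and the 90-second hold time is only an illustrative value since real defaults vary by vendor.

```python
class Session:
    """Toy model of a BGP-style hold timer."""

    def __init__(self, hold_time=90.0):
        self.hold_time = hold_time  # seconds of silence tolerated (assumed value)
        self.last_heard = 0.0

    def keepalive(self, now):
        """Record that a keepalive from the peer arrived at time `now`."""
        self.last_heard = now

    def is_up(self, now):
        """The session counts as up only while the peer has been heard recently."""
        return (now - self.last_heard) < self.hold_time


session = Session(hold_time=90.0)
session.keepalive(now=0.0)
session.keepalive(now=30.0)

# The line stays physically up but stops passing traffic after t=30.
still_up = session.is_up(now=60.0)     # within the hold time
torn_down = session.is_up(now=125.0)   # 95 s of silence: routes get withdrawn
```

A static route has no equivalent of `is_up`; it stays installed for as long as the interface claims to be alive.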
Re:wait by slashdotter78 · 2004-07-13 12:08 · Score: 1

I don't know of any device that can make BGP work better. All the BGP programmers I've talked to have said there is no way to load balance BGP lines without constantly manipulating AS number information.

austad is correct in everything he said. I don't see any reason to use BGP with only one provider. It would be much better to bond the lines. But to solve the problem where you only have one CO, you probably need to talk to the telco(s) and see if they can provide redundancy at that level. If you do use different ISPs (preferably with different backbones), then you would need to use BGP or some other multi-WAN solution.

Just so you guys know my background, I've worked at FatPipe Networks for three years now as a network engineer. They make Linux-based appliances that provide load balancing and redundancy without the use of BGP. Sometimes BGP is the right solution, but when you can't afford it or can't get your ISPs to cooperate, then you must find another solution.

As someone mentioned earlier, you can do things with DNS to get redundancy and that's what we use to get around having to do BGP. The downside with DNS as a solution is that you don't get redundancy at the IP level, but the upside is you get better inbound load balancing (as anyone who has ever worked with BGP knows all too well).
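The DNS approach can be sketched roughly as follows: each ISP gives the site a different public address, and the authoritative DNS server hands out only the addresses whose link is currently healthy, with a short TTL so clients re-resolve quickly after a failure. The names, documentation addresses, and 60-second TTL below are assumptions for illustration, not a description of how FatPipe's appliances work.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical record for one ISP connection."""
    name: str
    address: str   # public address reachable through this ISP
    healthy: bool


def records_to_publish(links, ttl=60):
    """Return (address, ttl) pairs for the A records to serve right now."""
    live = [link for link in links if link.healthy]
    # If every link looks down, publish everything rather than nothing.
    chosen = live or links
    return [(link.address, ttl) for link in chosen]


links = [
    Link("isp_a", "192.0.2.10", healthy=True),
    Link("isp_b", "198.51.100.10", healthy=False),
]
answers = records_to_publish(links)
```

Connections already open to the failed address still break, which is the "no redundancy at the IP level" downside mentioned above.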
Re:wait by Shakrai · 2004-07-16 02:47 · Score: 1

Both your T-1's go to the same ISP? Why are you running BGP then?

Umm -- maybe they go to different locations? The old ISP I used to work for had two T1s with Sprintlink -- one going to the NYC POP and one going to the Pennsauken, NJ POP. I used BGP to configure them because Sprintlink would keep advertising our routes even if the link went down.

My whole theory behind it was we could load balance our different IPs to different T1s (and with BGP we had control over it -- no such control with static routes on Sprint's end) and that by having the T1s in different cities they were much less likely to die at the same time.

--
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.

The system at one of my previous places of employ by YankeeInExile · 2004-07-12 17:10 · Score: 3, Interesting

We had four T1s -- two from MFS and two from Bell. Of the four T1s, two (one MFS and one Bell) went to one NSP in Santa Clara, and the other two went to a different vendor in Oakland.

We even had physical plant diversity -- the Bell loops came from cable that ran along Stevens Creek Blvd, and the MFS fiber came up from the street that ran behind us. Outside of the building burning down, we were bulletproof.

Ran three years without a single minute of downtime.

My crowning glory in network design. Never again did I work for an employer who was willing to put their money where their mouth was for reliability.

--
How does the Slashdot Effect happen given that no slashdotters ever RTFA?

Wait a second... by AlphaOne · 2004-07-12 17:12 · Score: 3, Informative

Now, the only true single point of failure is the physical cabling in the street, but in CA that doesn't get damaged very often.

Let me get this straight... you're complaining that a PDU went down at your provider yet you're perfectly happy that you're running both circuits over the same cable under the street? In California?

Cables are cut all the time. Stupid things like rain water seeping through insulation take down entire city blocks. A single earthquake can disable hundreds of square miles for weeks or months.

On the other hand, you rarely hear of the type of failure you experienced. A well designed data center can take quite a lot of failure without a significant (or any) reduction in service level.

Maybe your provider is different, but all the data centers I've ever dealt with have multi-path redundant power routing systems. If a PDU goes out, another one takes over. They constantly share the load yet can easily take it over if one or more fail.

Add to that the standard AC-DC-AC power path and you've got a pretty rock-solid power distribution system.

Unless you can completely eliminate your single point of failure, you're going to be at risk for down-time. In fact, even with a completely redundant infrastructure, things have a bad habit of conspiring against you anyway.

--
All opinions presented here aren't mine.

Redun-what? by Tux2000 · 2004-07-12 17:41 · Score: 3, Funny

The IT at "my" company seems to love single points of failure. Their motto seems to be "if there is a way to build a SPoF, do it". Recent examples:

The "services office" (where IT, language service, human resources and so on work) is connected through a single line to the "main office" 10 km away. One day, an excavator cut that line. Result: No one could work for hours, because each and every device including all computers and all printers use DHCP to get an IP address. And the DHCP server (and the DNS server) is located in the main office. There was a dedicated print server, but it was not allowed to work as DHCP and DNS server.

All servers in a remote office run on a single UPS. One day, yet another evil excavator cut the power line. All rooms went dark, the UPS switched to battery, all servers were running smoothly. The PBX had and still has no UPS, so only mobile phones still worked. The hotline of the local power authority told us it would take some hours to get the line fixed. So we needed to shut down the servers before the UPS battery was drained. But except for one or two servers, our IT supporter had no privileges to shut down the servers, so it had to be done from the main office. But neither the ethernet switches nor the router to the main office were connected to the UPS. We finally decided that the servers had had enough time to write their caches to the disks and simply disconnected them. And no, the UPS signal output was not connected to the servers. Now, it could signal a power outage and a low battery via ethernet -- if the switches were connected to a UPS.

Did I mention that all servers in that remote office are connected to a single switch (out of three), using up to three ethernet lines?

Did I mention various air conditioners that cannot cope with the heat of the servers on a hot summer day?

Did I mention that all remote office data lines (yes, one line per office) end in a single point in the main office?

Did I mention that we have a single mail server (MX for the domain) at our provider for all incoming external mail, which is regularly blacklisted, and that our internal MX consults those blacklists to fight spam?

(Hmm, I should really stop here or I won't finish until tomorrow.)

Tux2000

--
Denken hilft.

Re:Redun-what? by Anonymous Coward · 2004-07-13 09:17 · Score: 0

Did I mention that I read slashdot regularly?

Yours Truly,
The Manager

P.S. You're fired!
Re:Redun-what? by Anonymous Coward · 2004-07-13 14:49 · Score: 0

Sounds like my company! And we are a Fortune 100 Electrical Utility, BTW.

Redundant T1s working for me by darkone · 2004-07-12 17:42 · Score: 1

Up here in New England my company hooked up with an ISP (LightShip) who has provided me with 3 T1s, to 2 physical COs at no additional cost, and it is costing us LESS than our previous ISP. We mentioned redundancy to the sales guy, who got an engineer on the line and mapped out their network and the 2 different ways our Cisco router could be set up.
I only have one Cisco, and I know the copper shares some of the same poles. Still, a month after swapping, two of my T1s went down for 10 minutes when something happened at one of the CO's ATM switches, and the other T1 kept me going. For us (a small non-profit) this is more than enough redundancy.

--
-=Down Syndrome in Maine

After some thought... by BrynM · 2004-07-12 17:52 · Score: 3, Informative

I was thinking about your situation some more... Would it be too much to just move some of your redundancy off-site? If it's server availability to the internet that you need, it can be done with some work. Rent some space at a colo and put your stuff up there. Traceroute the connection from your current site to whatever colos you check out _and_ ask them about their upstream provider.
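A rough sketch of that traceroute comparison, in Python. The hop lists are made-up documentation addresses standing in for what you would copy out of `traceroute -n`, and `shared_hops` is a hypothetical helper, not part of any tool. It only shows IP-level overlap; it tells you nothing about whether two paths share a conduit, and the first hop or two out of your own site will always match.

```python
def shared_hops(path_a, path_b, ignore=("*",)):
    """Return the hops from path_a that also appear in path_b.

    Hops listed in `ignore` (traceroute prints "*" for a hop that
    did not answer) are never counted as shared.
    """
    seen_b = set(path_b) - set(ignore)
    return [hop for hop in path_a if hop in seen_b]


# Hypothetical hop lists toward two candidate colos.
office_to_colo_a = ["192.0.2.1", "198.51.100.9", "203.0.113.5", "203.0.113.77"]
office_to_colo_b = ["192.0.2.1", "198.51.100.9", "198.51.100.200", "192.0.2.250"]

print(shared_hops(office_to_colo_a, office_to_colo_b))
```

Anything shared beyond your own gateway is worth a question to the colo's sales engineer.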

If it's feeding a customer service center or a bunch of bratty executives or something, well... you're fucked ;) never mind what I said.

--
US Democracy:The best person for the job (among These pre-selected choices...)

Physical redundancy by crmartin · 2004-07-12 18:04 · Score: 3, Funny

I did a bunch of Wall Street work some years ago; we had an experience with this. The system was set up with two high-bandwidth redundant paths, leased from two big providers. (MCI and someone else, I don't remember.)

When WorldCom merged with MCI, then bought the other provider, no one thought much of it. Until a trenching machine trenched across one of the big trunks ... and we found out that the physically redundant lines had been consolidated into the same trunk.

Got you beat! by ebrandsberg · 2004-07-12 18:23 · Score: 2, Interesting

I used to work as a network engineer for an unnamed company, and we had a redundant set of connections connecting Seattle with Chicago, and to San Jose, then from San Jose to LA, down to somewhere in Texas, and up to Virginia, up to NYC, then back to Chicago. There was a backhaul incident in Texas, and the Chicago to Seattle connection went down AND the Texas to LA went down. Go figure.

It is well known that even if at any given time you are making use of different SONET rings, circuits get shifted around based on demand, and you could end up being rerouted onto the same circuits without any notice. The only way to know is to wait till a problem occurs, and see if it impacts more than one connection.
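If you keep outage logs per circuit, the "see if it impacts more than one connection" check can be sketched like this in Python. The log format (start and end in minutes since midnight) and the sample numbers are invented for illustration; a real log would need proper timestamps.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def correlated_outages(log_a, log_b):
    """Pair up every outage in log_a with each outage in log_b it overlaps."""
    return [(a, b) for a in log_a for b in log_b if overlaps(a, b)]


# Hypothetical outage windows for two supposedly independent circuits.
seattle_chicago = [(120, 185), (900, 910)]
texas_la = [(125, 190), (400, 420)]

print(correlated_outages(seattle_chicago, texas_la))
```

One overlapping pair proves nothing by itself, but repeated overlaps on circuits that are supposed to be diverse are a good reason to ask the carrier for current routing diagrams.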

The one thing missing.... by Dark+Nexus · 2004-07-12 18:23 · Score: 4, Informative

The BACKBONE. If your provider only uses one backbone, there's still a choke point. If the backbone goes down, for whatever reason (it can happen, and has happened), you've got the same effect as being redundant at your end but not at theirs... "theirs" is just further down the line.

There are providers that have multiple backbones, from different providers. I worked for an ISP that at the time had 4 different backbone providers. While there, I saw one of the backbones fail and stay down for several days because the backbone provider dragged their feet in fixing it. Everything else kept working, though, and the only difference was that during absolute peak usage, servers were very slightly slower in responding due to the missing bandwidth.

Being redundant between you and your provider isn't enough... ask if your provider's connection is redundant as well.

--
Dark Nexus
"Sanity is calming, but madness is more interesting."

Re:The one thing missing.... by Anonymous Coward · 2004-07-14 04:53 · Score: 0

Great point. I was in such a colocation center w/ 4 backbone providers; and even apart from being totally "down" they can have serious problems.
In this case, UUNet's and Exodus's networks around San Jose (both were big at the time) each worked fine at first glance, but they had *huge* (few-second) latencies between each other for days (weeks). Both just pointed fingers at the other. Being peered to both was the only way to satisfy customers on each.
And then we had a similar problem with part of Sprint's network two years ago; where the network worked great, but with 30 second(!!!) ping times, and zero packets lost(!!!) going from SF to Boston in Sprint's network. Throughput was unaffected (for our T1 lines), so they were doing some serious buffering. 30Sec * 1Mb/sec = 30Mb they buffered just for our connection while we had it saturated.
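The buffering arithmetic above, spelled out as a quick Python check. The 1 Mb/s figure is the rounded one from the post; 1.544 Mb/s is the nominal T1 line rate. This assumes, as the post does, that the whole 30 seconds was queueing delay on a saturated link, which is a simplification since ping measures round-trip time.

```python
T1_BITS_PER_SEC = 1_544_000  # nominal T1 line rate


def bits_in_flight(rate_bps, delay_sec):
    """Data that must be sitting in buffers to add delay_sec at rate_bps."""
    return rate_bps * delay_sec


rough = bits_in_flight(1_000_000, 30)      # the post's rounded estimate
t1 = bits_in_flight(T1_BITS_PER_SEC, 30)   # same sum at full T1 rate

print(rough / 1e6, "Mb")
print(t1 / 1e6, "Mb, or", t1 / 8 / 1e6, "MB")
```

At the full T1 rate that comes to roughly 46 Mb, a bit under 6 MB, held for one customer's circuit.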

SBC Served? by krangomatik · 2004-07-12 18:45 · Score: 2, Insightful

If you're in CA I'm guessing that SBC (Pacbell/whatever you know them as) is the local telco that provides the fiber service to your prem. I think you should be able to get diverse pathing from them. It will cost you some $$$, but it sounds like your organization is willing to pay for redundancy.

They should be willing to do diverse pathing to your local CO, or diverse pathing to separate COs. You ought to be able to get strands going out of two separate conduits from your building, and completely separate conduits all the way to your local CO, or another nearby CO. You could have a CO SONET node in your closest CO as well as a CO SONET node in a nearby CO and feed to your upstream provider from there (dunno if your upstream is PBI, which should definitely do this, or another provider, who should as well). That way you can set up a self-healing SONET ring that will survive (in theory) a fiber cut (yes, they do happen. Even in our lovely CA :P ) or a CO outage (as long as your upstream can feed you from both COs).

If you have a large enough netblock you should be able to get a connection from a second Internet provider and run BGP with them. Your problem then will be summarization at a close-by peering point, which is a complexity that you can get around (at a $$$ cost, of course). Just be aware that CO failures, cable cuts, and peering point failures all do happen, but you can always minimize or mostly eliminate them if your organization is willing to make a dollar commitment to it.

For the record, I am not an expert on this, but I have a bit of experience under my belt.

cheapest onsite redundancy? by dan_bethe · 2004-07-12 18:58 · Score: 1

Have you guys done onsite link redundancy, but with a cheap circuit as a backup? Such as by getting a $50/mo cable connection as a failover for your T1? Does that require ISP support on both sides?

I should probably keep researching Zebra and lartc and stuff.

Re:cheapest onsite redundancy? by dfranks · 2004-07-12 20:04 · Score: 1

Fine for outbound connections, but I doubt that any cable provider is going to support BGP or OSPF to allow you to maintain the same destination IP address on both links.
Re:cheapest onsite redundancy? by dan_bethe · 2004-07-12 20:17 · Score: 1
I'm only talking about cheap dual-ISP redundancy for hot failover only, not for load balancing. It'd work fine for outbound connections from our downstream customers to the Internet, but it wouldn't work for our local server hosting. For the latter, as far as I know, the two choices are either BGP or DNS-based failover.
DNS based failover isn't a good option for servers because of these reasons:
- because a T1 is not likely to be down for long and all the stupid Windows clients on the public internet that don't properly respect TTL will not respect either the failover update or the restoration update within that period of time
- I'd inherently need to host DNS offsite in order to propagate the new server IP addresses upon failover
As I understand it, BGP must be supported by the upstream ISP in order to work at all. Right?
Re:cheapest onsite redundancy? by slashdotter78 · 2004-07-14 04:18 · Score: 1

Yes, BGP must be supported by the upstream ISP in order to work. I'm an engineer at FatPipe Networks and we sell a Linux-based appliance that provides redundancy and load balancing without the use of BGP. And yes, we do it using DNS, but our box is intelligent in that it will detect line failures and change the way it answers DNS. The DNS runs on our box (we're running BIND).
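To make the general idea concrete (this is NOT FatPipe's code, just a guess at the shape of health-aware DNS answering), here is a minimal Python sketch. The link names, the addresses and `pick_answers` are all hypothetical, and it assumes the records are served with a short TTL so clients re-ask soon after a failure.

```python
def pick_answers(links, healthy):
    """Choose which A-record addresses to hand out.

    links:   dict mapping link name -> public IP reachable over that link
    healthy: set of link names whose health probe is currently passing
    """
    up = [ip for name, ip in links.items() if name in healthy]
    # If every probe fails, the probes may be what is broken, so fall
    # back to answering with all addresses rather than none.
    return up if up else list(links.values())


# Hypothetical two-link site: a T1 and a cheap cable backup.
LINKS = {"t1": "192.0.2.10", "cable": "198.51.100.10"}

print(pick_answers(LINKS, {"t1", "cable"}))  # both links up
print(pick_answers(LINKS, {"cable"}))        # T1 down
```

This does nothing about resolvers and clients that ignore the TTL, which is the objection raised earlier in the thread.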

Redundancy needs verification by digitaleopard · 2004-07-12 19:14 · Score: 2, Interesting

Well, I have a friend who is the network admin for her company and she experienced EXACTLY the same problem you did - "redundant" T1 lines running to the same CO; the gear in the CO went down (The ILEC steadfastly refuses to give details) and they lost connectivity. For 10+ hours.

Real redundancy costs real money.

I work in a professional Colo facility in Denver, and we are fully redundant in all systems. Once it leaves your box, there's two of everything. Dual power to the box, dual network connections (Backbone: Dual OC-12 lines from different providers, running to different boxes in the POP) Dual climate control systems, dual generator rooms with independent fire control systems...I could go on, but you get the idea. I'm on the graveyard shift, and things run so smoothly I get a lot of reading done. And with 150k square feet of building, a lot of walking as well.

It's not cheap. But if you really need redundancy, it's cheaper to rent space in a professional facility than it is to try and be compliant without one.

Different mediums by magefile · 2004-07-12 20:54 · Score: 1

I work for a small (less than 20 people) company that's branching into web stuff. Their old tech guy still does a lot of "consulting" work, since the company he now works for is often hired by this company for non-IT stuff. So we have 2 T1s to our servers in [city number 1], plus a cable connection or two. If by some weird chance the cable is out when the T1s fail (thank you Comcast :( ), then we can migrate everything to this guy's company, which has 2-3 T1s in a city about an hour's drive away. So, 2 different mediums for in-town, and a third backup that is not only with a different company, but in (obviously) another set of cables (i.e., not subleased by Company1).

re: Redundant Internet Access? by manavendra · 2004-07-12 21:02 · Score: 2, Informative

I don't know much about this, except that in the company I used to own, we had two *separate* T1 lines from two *separate* ISP's.

And that is why we called it redundant lines - ensuring if one fails the other would be able to keep us alive.

I never trusted any ISP about their claims on redundancy. Perhaps it's the competition in the business segment they are in, or whatever else, but I've found ISPs to be:

1. Ignorant about basic concepts, or at least not in sync with customers about them
2. Lying. As harsh as this word is, I've had numerous instances where it was later discovered that they were fibbing

--
http://efil.blogspot.com/

Re: Redundant Internet Access? by TheLink · 2004-07-12 21:12 · Score: 2, Funny

"except that in the company I used to own, we had two *separate* T1 lines from two *separate* ISP's"

Hah! You call that redundancy? Real redundancy is when you own TWO companies doing the same thing. :)


TowerStream by shadowxtc · 2004-07-12 22:44 · Score: 4, Interesting

If you live in any of the following areas...

# Chicago, IL
# New York, NY
# Greater Boston, MA
# Greater Providence, RI
# Newport, RI
# Westerly, RI

TowerStream may be something to look into. I use them as our primary connection at the office - they are far cheaper than a traditional T1 ($350/mo for 512k, $500 for 1.5mbit, they can handle around 5GBit max I believe).

True line-of-sight is not required; a reflected signal is usually sufficient. An external flat-panel antenna about 6 inches tall and wide is required, however. With ours set up on the roof, we get 0% packet loss, and have had no problems through heavy snow, rain or thunderstorms.

I have occasionally had connection issues, where the wireless modem has needed to be power-cycled. I suspect, however, this is simply due to it overheating :).

Joking and Seriously by 4of12 · 2004-07-13 00:44 · Score: 3, Interesting

if you want to find out about "redundancy" find out what they do in the military.

Cost is another matter....

--
"Provided by the management for your protection."

Yes Yes by Hard_Code · 2004-07-13 01:27 · Score: 1, Funny

of of course course my my circuits circuits are are redundant redundant

--

It's 10 PM. Do you know if you're un-American?

Longer term! by JC_England · 2004-07-13 02:00 · Score: 1

I have similar experiences of things that weren't as redundant as claimed. One other point to consider is that you will need to set up a PROCESS to check this redundancy at least once per year. Otherwise some maintenance will occur which "optimises the service" - reduces cost for the operator by rerouting on a cheaper connection. This will very likely end up in the same duct, if not on the same power and router kit as the primary. First you'll know is when the digging machine down the street takes the whole site out.

Another concern - for those of us in Toronto last year when the lights went out - is the frequency with which UPS fail. Again - test regularly, rehearse regularly...

True Redundancy is Best Achieved... by haplo21112 · 2004-07-13 02:52 · Score: 0

...by sourcing from different providers.

For instance, one T1 from AT&T and one from MCI.

--
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.

Re:True Redundancy is Best Achieved... by Anonymous Coward · 2004-07-13 05:21 · Score: 0

Agreed.
At a previous job, we used our disaster recovery site as our 2nd connection to every location we had, including Internet access. We had carrier diversity, location diversity (different cities) and we were sure to beat up the vendors for their city demarc, fibre and cable diagrams (they normally never share these, but you can beat it out of them). You should make sure that your Primary/Secondary location connectivity isn't running on any of the same paths as your other circuits.

Diversity is the key!

The problem is the ILEC. by oneiros27 · 2004-07-13 03:49 · Score: 4, Informative

No matter who you order from someone has to do the last mile (aka, local loop). Typically, that's the Incumbent Local Exchange Carrier (ILEC), which is normally one of the baby-bells, or whatever they've become since they've started merging back together.

You might get a line from Sprint that goes through Chicago, and another from MCI that comes from Dallas, but when they get to your town, they hand it off to the ILEC, who runs the last mile.

Even if it was hooked up to a different switch, or was terminated at a different CO, you still have redundancy problems -- odds are, the lines come into your building at a fixed point, which could be hit by a backhoe.

I know of an ISP that was serviced directly by a CLEC (the city-run cable company pulled fibre to them, besides the copper run from the ILEC...) but they were run on the same poles, so it didn't matter.

The only really redundant systems I know of didn't use wires for one of the components. Typically, they had lines pulled to two different places, through two different COs (in one case, in bordering states, that were on different power grids), and then connected the two with microwave. This way, the second leg completely avoided the ILEC.

It's not cheap, but well, redundancy doesn't tend to be.

In the long run, you have to look at what the costs are going to be, and what sort of losses it's going to prevent, and if the additional benefits are going to outweigh the cost.

Oh -- and typically, even if a CLEC (competitive local exchange carrier) has their own switch, the last mile is still typically handled through the ILEC, which puts you back in the same boat. Even with DSL, it doesn't matter if there are two different DSLAMs, if they're routed through the same CO or SLIC.

--
Build it, and they will come^Hplain.

Re:The problem is the ILEC. by perlchild · 2004-07-13 09:04 · Score: 1

ILEC phaw! I'm talking about Ethernet-over-Fiber, not ILEC-provided-circuits.

I guess I should have written a two page article about things that weren't my point, so people who wanted to ignore the routing aspect of my post didn't have the excuse.
Re:The problem is the ILEC. by innosent · 2004-07-13 15:54 · Score: 1

Or, to make things simpler... If you really NEED true redundancy, having things in the same building is a problem. So either buy rack space at a colo, or run your redundant setup to another physical location, serviced by a different CO, preferably owned by a different company. Of course, then you need to be redundant as well, since you'll have to be in both places at once. (In other words, this solution requires two buildings, two carriers, and [at least] two sysadmins.) However, this may still be cheaper than actually having a long distance copper/fibre/microwave setup from a second CO, assuming that your company already has another building somewhere, and another sysadmin available there. But why go to the trouble of having redundant systems if you put them in the same place? One fire/lightning strike/power outage/flood/construction worker/etc. can take both links/systems down just as easily as one.

--
--That's the point of being root, you can do anything you want, even if it's stupid.

Score +1, Sad by Thng · 2004-07-13 05:07 · Score: 1

Sounds like the IT architects need some glass stomachs

Earthquakes? by Marxist+Hacker+42 · 2004-07-13 05:16 · Score: 1

Now, the only true single point of failure is the physical cabling in the street, but in CA that doesn't get damaged very often.

I'm not sure if a T1 is copper or fiber- but I doubt very much that either could withstand a separation and dislocation of concrete.

--
SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.

Re:Earthquakes? by Secahtah · 2004-07-22 15:45 · Score: 1

We have several T-1s where I work. They each consist of 2-pair copper lines.
Re:Earthquakes? by Marxist+Hacker+42 · 2004-07-26 05:07 · Score: 1

What gauge are the copper lines? Do you think they could survive, say, a fault line shearing off of even 1"? My point was California is not exactly the safest place to try to build a guaranteed up-time network center; I'm not sure where is, but I'd at least pick someplace relatively geologically stable.

--
SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.

we thought we were redundant by Beaker1 · 2004-07-13 06:23 · Score: 1

Of course almost nothing can stop a guy with a backhoe from killing it when the fibre from two different companies runs down the same path under the street... :(

--
"Who hasn't slipped into the break room for a quick nibble on a love Newton before?" - Mr. Peterman.

Backup lines by raider_red · 2004-07-13 06:31 · Score: 1

One company I used to work for used a commercial Road Runner connection to back up our corporate T3 line. It worked pretty well. One time the T3 line went down for two days and no one noticed unless they were pushing large files out to our remote sites.

--
It's good to use your head, but not as a battering ram.

Local Echo by raygundan · 2004-07-13 08:00 · Score: 1

No, that just means you left local echo enabled.

Odds are You Can't by bill_mcgonigle · 2004-07-13 08:57 · Score: 0, Troll

If you know enough to get diverse lines from two CO's when you buy circuits nothing says they will stay that way. You have to constantly re-prove it at the best interval you can handle.

If you get diverse paths to multiple CO's those CO's may share a common backhaul to the next more metropolitan area.

In most locations you can't get lines from anybody but the local telco, and all the lines run together.

In most locations if you can get lines from different providers they run along the same poles.

Most companies (small) can't get a big enough address block to get a route.

Many ISP's won't cooperate to "help each other" for you to use the BGP route if you have a big enough company to get a network. If there are only a few ISP's in your area this is even more true.

You need to at least run lines out different ends of your buildings, preferably you should have separate buildings with different power, etc. Then if the regional power goes out you need big generators at both buildings.

See, it's super expensive to actually get real redundancy. Try turning the problem around.

Rent some server space at different data centers in different areas of the country. Use a round-robin DNS or better. Take advantage of the new fast-updates to the .COM and .NET zones. Think of your office as a leaf node.

It's cheaper to pay for the bandwidth back to your office than it is to go for redundancy there.

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)

78 comments