Session Management and Mega-Proxies?
chicagothad asks: "I help
to run large internet systems for a few Fortune 500 companies. We are
running several clustered systems, comprised of both Microsoft and Linux
technologies. We have run into several problems with what is known as
a 'mega-proxy'. A mega proxy is a way that large internet providers
distribute their outbound traffic via a pool of IPs. AOL/Compuserve
is the largest example of this. We are having fits with session
management right now. Does anyone have any ideas on the best system
structure or design to manage these beasts or any other tips that may
be helpful?"
Usually the best way is to hold the session key in the URL the user's using.
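A minimal sketch of carrying the session key in the URL, assuming a query parameter named `sid` (the parameter name and the helper names are made up for illustration; most web frameworks ship their own version of this):

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

SESSION_PARAM = "sid"  # hypothetical parameter name


def new_session_id() -> str:
    # Random, unguessable key; nothing about the client's IP goes into it.
    return secrets.token_urlsafe(16)


def add_session_to_url(url: str, session_id: str) -> str:
    # Rewrite a link so the session key travels with every request.
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    query[SESSION_PARAM] = [session_id]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def session_from_url(url: str):
    # Return the session key from an incoming URL, or None if absent.
    values = parse_qs(urlsplit(url).query).get(SESSION_PARAM)
    return values[0] if values else None
```

Every link the application emits would go through something like `add_session_to_url`, so the key survives no matter which proxy IP the next request arrives from.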
Most web app environments provide this functionality, either built in or through a library. PHP and Perl both do, and I'm fairly sure IIS/ASP does.
This is one of those problems that has been solved a thousand times before, and chances are someone else has done a better job of solving it than you will on your first try.
-- free as in swatantryam - not soujanyam.
What kind of problems are you having? Because you mentioned large organizations, I'm assuming you are talking about problems with large-volume web server farms and traditional session management techniques.
Basically, the problem is this:
Sessions are usually stored in RAM. Thus, the session only exists on one web server even if there are lots of web servers. To make sure that the right web server gets the traffic, the client IP, destination IP, and (sometimes) destination port are hashed together to determine which server to go to. Because the hash is deterministic between requests, this method ensures that if a user hits Box A, he will continue to hit Box A, provided these things do not change (that is, source IP and destination IP/port).
The problem with the mega-proxies (and lots of other forms of NAT where there are multiple outgoing IPs) is that the source IP does change. Thus, the hashing technique described above fails. Cisco Local Directors, among others, do exactly this kind of hashing.
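To make the hashing scheme above concrete, here is a rough sketch. The backend names and the choice of SHA-256 are assumptions for illustration only; real load balancers use their own hash functions.

```python
import hashlib

BACKENDS = ["box-a", "box-b", "box-c"]  # hypothetical server names


def pick_backend(client_ip, dest_ip, dest_port, backends=BACKENDS):
    # Hash the connection tuple and map it onto one backend.
    key = f"{client_ip}|{dest_ip}|{dest_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Same tuple, same backend, every time.
first = pick_backend("198.51.100.7", "203.0.113.10", 80)
again = pick_backend("198.51.100.7", "203.0.113.10", 80)

# A mega-proxy may send the next request from a different pool address.
# The hash input changes, so nothing guarantees the same backend is chosen.
other = pick_backend("198.51.100.8", "203.0.113.10", 80)
```

The point is only that the mapping depends on the client IP: once the proxy rotates that address, the request can land on a box that has never seen the session.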
The solution I've implemented basically keeps the session information in RAM, although it does this through a middle layer. If I get a session ID from a browser but cannot find that session ID in my RAM, I put a query out to the server farm network and ask, "Who has this?" Whoever has it transfers the session to me (transfer, NOT copy, as I only want one authoritative copy).
You have to be careful of concurrency issues while doing this, but if you are careful, it will work well and be extremely fast for the majority of users, as they remain at one IP for the duration of their session. But it allows the possibility of a session migrating.
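The transfer-not-copy idea might look something like the sketch below. This is not the poster's actual implementation: the `Node` class, the in-process "peers" list standing in for the farm network, and the locking scheme are all assumptions made to keep the example self-contained.

```python
import threading


class Node:
    """One web server holding sessions in RAM (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}
        self.peers = []  # other Node objects; stands in for the farm network
        self.lock = threading.Lock()

    def release(self, session_id):
        # Give up a session: remove and return it in one locked step,
        # so at most one requester can ever receive it.
        with self.lock:
            return self.sessions.pop(session_id, None)

    def get_session(self, session_id):
        with self.lock:
            if session_id in self.sessions:
                return self.sessions[session_id]  # fast path: already local
        # Not local: ask the farm "who has this?". Our own lock is not held
        # while calling peers, which avoids lock-ordering deadlocks.
        for peer in self.peers:
            data = peer.release(session_id)
            if data is not None:
                with self.lock:
                    self.sessions[session_id] = data
                return data
        return None
```

For example, if node A holds a session and the next request lands on node B, `B.get_session(sid)` pulls it across and A no longer has it. Note the short window during the hand-off in which neither node holds the session; a real system would need to decide how to handle requests arriving in that window.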
Another option is to use a central "session repository" like a database or special application server. These are almost always going to be bottlenecks, though.
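For comparison, a central repository can be sketched in a few lines. SQLite is used here purely as a stand-in so the example runs on its own; a real farm would point every web server at a shared, networked database or application server. The class and table names are made up.

```python
import json
import sqlite3


class SessionRepository:
    """Single shared store that every web server reads and writes."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def save(self, session_id, data):
        # One round trip to the central store on every write.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
                (session_id, json.dumps(data)),
            )

    def load(self, session_id):
        # One round trip on every read; this is where the bottleneck shows up.
        row = self.db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Any server can answer any request, which sidesteps the changing-IP problem entirely, at the cost of a store access on each request.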
I will say that this is not uncharted territory. The solutions to these kinds of problems are well known. If you are dealing with Fortune 500 companies, make sure your project is funded well enough to bring in some experts as consultants... This is a fundamental issue to get right, and if you have problems here, I suspect you'll encounter more problems later.
I push buttons on computers at several Fortune 500 companies for a living, and something about the "MegaProxy" gizmo doesn't work. Unfortunately, the $bignum that they pay me isn't sufficient for me to think of my own solutions.
I think they run Microsoft or Linux or something.
Can someone help?
Oh. AOL does something similar, I hear. Does anyone know how they make that work?
Please help.
Thanks, Bob.
Look at the former rogers@home and don't do it
What we see depends on mainly what we look for. -- John Lubbock Now search for that bug slave!
Putting my medium-sized-mega-proxy-admin hat on (some 120 million reqs/day), I'd suggest that you don't use the source IP as part of your session key generation or implementation. Things break when remote hosts depend on the source IP remaining constant.
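In code terms, that advice amounts to something like the following sketch, where the session key is purely random and the lookup never consults the client address. The function names and the in-memory dict are illustrative assumptions.

```python
import secrets

_sessions = {}  # in-memory store, for illustration only


def create_session(user):
    # The key is random; the client's IP plays no part in generating it.
    session_id = secrets.token_urlsafe(32)
    _sessions[session_id] = {"user": user}
    return session_id


def lookup_session(session_id, client_ip=None):
    # client_ip is accepted but deliberately ignored, so a request arriving
    # from a different proxy address still finds the same session.
    return _sessions.get(session_id)
```

A request from 192.0.2.1 and a later one from 192.0.2.99 that present the same key resolve to the same session.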
/. do session management? ;-)
With more and more ISPs implementing proxy clusters and intercepting outbound traffic, this is a problem that is going to grow, rather than one which will go away.
If you insist on using the source IP as part of your session management, you might want to look at the HTTP_X_FORWARDED_FOR and HTTP_VIA headers, as these sometimes allow you to determine the original IP of the client before it hit the proxy. However, it's not failsafe - some ISPs anonymise the IP, and some proxies don't provide the headers. Where this information is provided, it is generally concatenated - so you can get the detail even where the user is passing through multiple proxies.
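As a rough illustration, this is how a CGI- or WSGI-style application might read that header. The `environ` dict and the helper names are assumptions; the header is supplied by the client side and can be forged, so treat the result as a hint rather than an identity.

```python
def forwarded_chain(environ):
    # The header is a comma-separated list, one entry per proxy hop,
    # with the address closest to the original client first.
    raw = environ.get("HTTP_X_FORWARDED_FOR", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def original_client_ip(environ):
    # Best guess at the client's address: the first forwarded entry if the
    # proxy supplied one, otherwise the address that connected to us.
    chain = forwarded_chain(environ)
    if chain:
        return chain[0]
    return environ.get("REMOTE_ADDR")
```

When a proxy anonymises the entry or omits the header, this falls back to the proxy's own address, which is exactly the value that keeps changing.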
How does
Without knowing something so basic to web site operation?
Uses cookies. Much simpler to simply use a non-permanent, RAM-based cookie and deal with it that way.
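A non-permanent cookie is just one sent without an Expires or Max-Age attribute, so the browser keeps it in memory and drops it on exit. A small sketch using Python's standard library follows; the cookie name `sid` and the helper names are made up.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id):
    # Build the value for a Set-Cookie header. No expiry attribute is set,
    # which is what makes this a session (in-memory) cookie.
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    cookie["sid"]["path"] = "/"
    cookie["sid"]["httponly"] = True
    return cookie["sid"].OutputString()


def session_id_from_header(cookie_header):
    # Parse the Cookie header sent back by the browser.
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie["sid"].value if "sid" in cookie else None
```

Because the browser sends the cookie back regardless of which proxy address the request leaves from, the server can find the session without looking at the source IP at all.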
Why should we (the intelligent users) tell you (the overpaid 'consultant') how to put together a good session-based system? The information is readily available on the web for anyone to read, and it's not even that hard to find, assuming you know how to use a search engine such as Google. The solution is a hybrid of simple techniques which make up for each other's weaknesses. Just use that Fortune-500 brain you've been neglecting all these years.
-Billco, Fnarg.com
"Maintaining Session State on Your Web Farm"
by Marco Tabini
IIS and ASP provide several methods to track a user's session on the server. But when you have several servers running concurrently, you have to modify your approach.
URL: http://msdn.microsoft.com/library/en-us/dnmind99/html/session1099.asp
This suggests one way of doing it from a few years ago, which isn't to say it's the right way, but it should give you some ideas...