Why PayPal Chose OpenStack
AlbanX writes in with this story about PayPal's use of OpenStack. "PayPal's IT team has taken control of its technology release cycle by shifting key components of its IT infrastructure onto OpenStack. For PayPal, the decision to use components of OpenStack was based around speed to market. It allows the payments provider to untether its release cycle from those of vendor partners. 'PayPal has not historically been known for its fast reactions,' PayPal senior engineer Scott Carlson conceded to attendees at the VMworld conference in San Francisco this week. 'It has taken us six to nine months sometimes to react to our competitors.'"
No articles on Credit Suisse's decision not to use OpenStack and their rationale for not doing so? On my own head be it, I suppose....
Seriously man. It's 2013. Open a new tab and type "Openstack".
How lazy can you get?
It's a collection of open source APIs for the management and automation of virtual machines at scale. It can be used with a number of hypervisors including VMware vSphere, Xen, KVM, Hyper-V and a few others.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
They have a single business, with a relatively constant, yet probably growing number of users. If they need 100 servers today, next year it will be 120. They don't need massive content or application distribution, they don't need to rush new products to market, they don't need all that cloudy stuff. If "OpenStack" just means they virtualized some old iron that's another story, but a far less interesting one.
I want to delete my account but Slashdot doesn't allow it.
This is a very complex series of decisions, and it's not really easy or possible to say, "Well, we didn't decide to do Openstack because VMware is better"
He flat out said that he won't add another layer of complexity that would increase his true cost of labor, because his present system is orders of magnitude cheaper! He also said that they challenge this assertion against third-party cloudy computing and continue to prove it correct, at least for them, every 12 to 18 months.
Orders of magnitude cheaper on a $1billion annual spend.
I can give you a shitload of reasons why you should not use PayPal!
Those involved in virtualisation probably knew this already (or should have).
The hypervisor war is done. Pretty much everyone (VMware, MS, Citrix) has a new cloud-based offering that is agnostic towards the hypervisor that runs on the tin. If you have played with vCloud Automation Center, for example, there are multiple options for the hypervisor type, including Citrix. The bottom line is there is not much more to add to the hypervisor, and there is also less money in the hypervisor. It is an old (mature?) technology.
The new hot button is the tools to manage the infrastructure and that is where the real war is going to be won or lost.
http://www.writeitfor.us - Writing IT for the IT generation.
Why would anyone ever want to have large scale virtual machines, other than web service providers, that sell you a virtual server to do with what you will?
If you want to manage your company, and your users, and perform calculations, and transfer funds, adding on a whole new layer of overhead has to be the stupidest idea I have ever heard of.
Troll is not a replacement for I disagree.
In the PayPal use case they could use it to take an internal image and burst it quickly into a cloud provider to scale up their capacity as they see demand spiking beyond what their internal resources can accommodate (not saying they are doing this, just that it's a possible application). For a typical enterprise it's useful to allow on-demand lab creation for developers: snapshot the current production machines and generate an isolated sandbox that accurately mirrors the production environment. Automated unit testing is another popular use of API-driven provisioning. If you can't find a use for automation in your environment, it's either too small to qualify (I'm in this boat, we're fairly big at over 300 VMs at our main site but still too small for much automation) or you're not thinking outside your current box.
Not if you want things to scale easily. I think a good example is SOE. They have a bunch of MMOs and years ago they virtualized their servers. Now it's completely irrelevant how many players are playing any particular game they have. If they have even 1 paying customer, a limited amount of resources is dedicated to the server side of that player's gameplay. If the population suddenly shoots up to 100,000 it just scales up the assets dedicated to that game. It's brilliant really and is why SOE has been able to keep decades-old games going for so long. None of their hardware is application specific.
In a non-virtualized environment, once the player population fell below a few thousand it would no longer be profitable for them to keep the game running and they'd have to shut it down... and lose all those customers.
you ask a valid question, and normally i would respond, but throw a backhand in and, nope! but your assertion is quite ignorant.
Because many companies have many products and development efforts going on at any given time. Because many companies have weeks- or months-long provisioning cycles for new physical servers.
Because as the summary says, having a platform allowing for rapid (consistent) deployment of infrastructure like development, test, and production servers allows companies faster time to market on their major development efforts.
Because as you may have heard, software-defined infrastructure allows you to leverage automation to a much greater extent than with physical servers, allowing you to literally provision 100 identically configured servers in a couple hours for a server farm with just a couple commands.
Because if you can't see the benefit in allowing that, you aren't qualified to express an opinion about the wisdom of the approach.
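To make the "100 identically configured servers with a couple commands" point concrete, here is a minimal Python sketch. It only builds the list of provisioning requests from one shared template; the names (ServerSpec, provision_farm, the "base-linux" image) are made up for illustration and are not OpenStack's or any vendor's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical template: every server built from it is configured the same."""
    image: str
    vcpus: int
    ram_gb: int
    disk_gb: int


def provision_farm(spec, count, prefix="web"):
    """Return `count` (name, spec) requests that all share one spec."""
    return [(f"{prefix}-{i:03d}", spec) for i in range(1, count + 1)]


# "A couple commands": define the template once, then ask for 100 copies.
spec = ServerSpec(image="base-linux", vcpus=2, ram_gb=4, disk_gb=20)
farm = provision_farm(spec, 100)
```

In a real setup each request would be handed to whatever provisioning API you use; the point is only that the configuration lives in one place rather than in 100 hand-built boxes.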
It's to do with flexibility. Maybe I have services that can't co-exist but their busy period doesn't overlap. Maybe I have some servers that are rarely used so they can all take advantage of an over-committed physical CPU. Maybe I have lots of development environments where performance is hugely important, or I just need the ability to fire up another server for testing purposes, maybe for a matter of hours, days or weeks and buying in new hardware isn't justified.
In the main? Generalising hardware, making it a commodity, especially in the Windows world - having multiple redundant systems quickly gets to be a pain in the arse once you stop having identical hardware, you can't simply move a system to a larger hardware platform because that brings with it a lot of caveats in terms of non-identical hardware (does it perform the same, have the same overheads, have the same cascade effects etc).
Moving to an internally virtualised environment standardises the hardware aspect, or rather it moves its concern to that of the virtualisation layer - to the guest, it looks the same regardless. Makes having redundant systems much easier, makes having a production-identical UAT environment much easier, and it makes having an offsite business continuity environment much easier - build the environment once, deploy it many times in isolation.
So basically VMs do the job that OSes are supposed to do?
That is all there is to it. Who cares what tech or software they are using when they are the best example of how to be a ripoff bank on the net. Their policies and politically minded actions against the people are all that matter. They suck and no one should utilize them...ever. No AC here, mod me down all day till I have negative karma, I don't care. Paypal is shite and everybody knows it, some people make a little money using that sham of a financial institution and the money drives their thoughts.
Now for the real question...
How does virtual machine automation help PayPal's release cycle???
That part doesn't make sense to me. It has the potential to save them money and better manage and trend their VM environment, but... faster reaction times? What, they can't clone a VM in a timely manner? It certainly won't write code for them. Oh well, IT buzzwords ftw.
Not really, there's a world of difference between the two. It's really a case of bringing portability, redundancy and scalability to a whole raft of applications which didn't have it before, and simplifying it dramatically in the process.
I don't think any OS below the million dollar mark has supported moving running processes and environments between actual independent hardware nodes with little or no interruption, and now you can have that with a free OS and a little configuration.
No longer do you have to wait for your vendor to send over a part in order to stop a critical system running in degraded mode - just allocate that VM to a different node in your cloud cluster, if it wasn't done automatically for you...
i think the real story here is that a large business actually realized the benefits of open source are greater than the ability to play the blame game by paying for vendor support.
then again, maybe grumpy cat is really in charge of IT.
Anons need not reply. Questions end with a question mark.
All of these reasons are correct. Plus there's one intangible benefit you missed.
Developers need quick access to OS images and a software stack representative of what might be on a given production server. Now you can spin up whatever OS you like and install whatever software you need in less time than it takes to download an ISO.
The production environment must be able to respond quickly and elastically to spikes in demand for services. Now it can.
Anyone can custom-configure a VM with appropriate hardware requirements for the application(s) running on the VM. If you don't need 8 GBs of RAM, set the VM to use 4 GBs. If you only need 20 GBs of disk, you don't need to use 80 GBs and waste resources. But if you need 32 GBs RAM and a terabyte of storage (or more), you can have it (within reason) and you can have it in 5 minutes.
One intangible benefit of the above that didn't get mentioned is that now access to a hefty server (one or 25 of them, whatever you need) is available to anyone without getting pushback from management or IT, thereby "democratizing" IT functions. Sweet freedom!
If you think this is purely a PayPal thingy, think again. It's a strategic company-wide initiative. All of eBay and its myriad divisions now have these capabilities. I hope eBay's competitors quake in their boots.
This setup is pretty cool and much appreciated.
Many Bothans died to bring you this information. I'm a Bothan, and I want to live. You didn't hear any of this from me. I'm an anonymous coward for a reason. :)
really .. tell us how you feel about it.
Did they decide on OpenStack via wrestling match?
This is a good opportunity to get a nice list of Paypal Competitors. I have no clue. Any suggestions appreciated.
Free Road Rage Reducer
I agree. When I read about their 15 minute rule I was like: "Well that's nice, but are deployments really where they are losing time in reacting to competitors?"
Having been involved in production deployments for over 15 years now, I can tell you that deployment time is pretty much a non-issue with speed to market. Things like VMs and storage options and all of that make for easier, and less risky deployments, I agree. However, I have never heard of a piece of software that was ready for prod that was waiting for more than a simple scheduled maintenance window in order for it to go out. Maybe some of the more complex deployments might have required a week at most to practice and get the scripts right. That's it.
Compare that to the weeks and months that marketing, product management, development, QA testers are working on features and it is insignificant.
When I do a deployment, I'm perfectly willing to do 8 hours of prep work to make sure that my customers could get 5 minutes less downtime, and preferably no downtime. Rushing things into production from test is like spending your R&D budget on improving fuel efficiency from 50 to 51 MPG while you spend nothing on getting 12 MPG trucks up to a respectable 18 or something. Where is the real time savings?
If this is Paypal's solution by itself, I predict that they will remain behind the curve going forward as well. Maybe they should figure out how to improve their development process?
The most important part of this post is that he mentions that Paypal have competitors!
Please, I would love to know more about these!
Compare that to the weeks and months that marketing, product management, development, QA testers are working on features and it is insignificant.
Where is all of this development and QA testing going on? The point of a cloud is to build an automated image that can be deployed to any of your environments. Presumably you are architecting your applications so that they scale as well, which the cloud is excellent for.
For example, the dev team is done with their changes and wants to push from the dev environment to UAT. Say you have two instances in UAT. Instead of shutting down, updating your code, and starting up each machine one at a time, you just deploy 2 new instances. The deploy script fires off a test or two to make sure that the app started successfully. If you are doing something web related it hits the API on the load balancer to add the new nodes in, maybe fires off a couple of tests against the VIP, removes the old nodes, runs a couple more tests against the VIP, and just like that your deployment is complete. If you are doing something messaging related, where your services communicate over a message bus of some kind, presumably your apps just connect themselves. QA can then test and determine if the code is ready to promote to production.
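A rough Python sketch of that deploy flow: start new instances, check them, add them to the load balancer, test the VIP, pull the old nodes, test again. Everything here (LoadBalancer, make_instance, vip_ok, deploy) is a hypothetical in-memory stand-in, not a real load balancer or cloud API.

```python
class LoadBalancer:
    """Hypothetical in-memory stand-in for a load balancer API."""

    def __init__(self, nodes=()):
        self.nodes = list(nodes)

    def add(self, node):
        self.nodes.append(node)

    def remove(self, node):
        self.nodes.remove(node)


def make_instance(version, index):
    # Stand-in for "boot a VM from the image for this version".
    return {"name": f"app-{version}-{index}", "version": version, "healthy": True}


def vip_ok(lb):
    # Stand-in for a test against the VIP: some node in the pool is healthy.
    return any(node["healthy"] for node in lb.nodes)


def deploy(lb, version, count):
    """Swap the pool over to `count` fresh instances; return the retired nodes."""
    old_nodes = list(lb.nodes)
    new_nodes = [make_instance(version, i) for i in range(count)]
    # Make sure the app started on every new instance before it takes traffic.
    if not all(node["healthy"] for node in new_nodes):
        raise RuntimeError("a new instance failed its startup check")
    for node in new_nodes:
        lb.add(node)
    if not vip_ok(lb):
        raise RuntimeError("VIP check failed after adding new nodes")
    for node in old_nodes:
        lb.remove(node)
    if not vip_ok(lb):
        raise RuntimeError("VIP check failed after removing old nodes")
    return old_nodes  # the caller destroys these


lb = LoadBalancer([make_instance("v1", 0), make_instance("v1", 1)])
retired = deploy(lb, "v2", 2)
```

The old nodes stay in the pool until the new ones have passed their checks, so the pool is never empty during the swap.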
Of course this all sounds a bit overcomplicated for a UAT environment, but once you have every single piece of the puzzle automated, that is when things get interesting. Let's say that your source control allows you to mark your projects as "ready for UAT", and allows your QAs to mark them as ready for production. You can just have your entire test system rebuild itself every day and production release once a week through the same process. Also, since you are in a position to scale through either your load balancer or a messaging bus, you can use the same processes to scale.
Say your monitoring service watches CPU usage on your web app servers, and as soon as you hit 75% it fires up another instance. If load continues to come in after the new node is online, then fire up another one. If the CPU drops below what is required to run on n-1 nodes, then hit that load balancer API, take a node out and destroy it.
Even if you don't need to scale, doing deployments often means shutting down apps on one node at a time. If you design everything for n+1 nodes, that means that during your deployments you are still running fine, but you are "at risk".
From a sysadmin perspective this is all great because a machine going down means nothing. Your monitoring services see that the VMs are offline, and they fire up another instance wherever there are extra CPU cycles. Without a "cloud" this is easy to do with VMware as long as you have shared storage, but do you trust the VM? It died while the system was running; what if something was corrupted? With the cloud you don't care about the old instance, just let it die. Create a new one instead.
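That scaling rule fits in a few lines of Python. This is only a sketch under assumptions I'm adding: load is spread evenly across nodes, cpu_percent is the average across the pool, and the function name and min_nodes floor are invented for the example.

```python
def desired_nodes(cpu_percent, nodes, scale_up_at=75.0, min_nodes=1):
    """Return how many nodes the pool should have given average CPU usage."""
    if cpu_percent >= scale_up_at:
        # At or over the threshold: fire up another instance.
        return nodes + 1
    if nodes > min_nodes:
        # Estimate the average CPU if the same work ran on n-1 nodes.
        projected = cpu_percent * nodes / (nodes - 1)
        if projected < scale_up_at:
            # n-1 nodes could carry the load: take one out and destroy it.
            return nodes - 1
    return nodes
```

For example, 3 nodes averaging 30% would project to 45% on 2 nodes, so one can go; 3 nodes averaging 60% would project to 90%, so the pool stays as it is. A real setup would probably want some delay or margin so it doesn't flap around the threshold.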
Everyone loves VMs due to live migration but why not take that one step further to save yourself some money while you are at it. Skip the extra cards/ports/networks/switches/cables for iSCSI/FC, skip the expensive SAN and pack the hosts with some cheap SSDs. You don't need to migrate when you can just redeploy (although many of the hypervisors support moving images without shared storage now). This way you get local read/write speeds which will probably beat any SAN that you don't have to sell your first born to buy.
Backups also become much easier. Backup the images (which won't change often), and the source control systems, and you are pretty much set (maybe a couple of 1 off monitoring and deployment servers). Why bother with expensive dedup/compression appliances when you just have a few images to backup. DBs are an obvious exception there.
Speaking of DBs, let's say you want to rebase one of your lower environments off of a higher one. Let's say you want to refresh the dev DBs with the test DB data. You just hit the API and take a snapshot of the test DB data image. Clone that snapshot to a new image, shut down your DB node, detach the old image, attach the clone, and start it back up. All of which can be an automated process. No pestering the DBAs to do it for you.
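Here is what that refresh might look like as a script, sketched in Python against a fake in-memory storage layer. Storage, DbNode and refresh are all hypothetical names for illustration; a real version would call your cloud's volume and snapshot API instead.

```python
import copy
import itertools


class Storage:
    """Hypothetical in-memory volume store standing in for a storage API."""

    def __init__(self):
        self.volumes = {}
        self._ids = itertools.count(1)

    def create(self, data):
        vol_id = f"vol-{next(self._ids)}"
        self.volumes[vol_id] = copy.deepcopy(data)
        return vol_id

    def snapshot(self, vol_id):
        # Point-in-time copy of an existing volume.
        return self.create(self.volumes[vol_id])

    def clone(self, snap_id):
        # New independent volume built from a snapshot.
        return self.create(self.volumes[snap_id])


class DbNode:
    def __init__(self, name, volume):
        self.name = name
        self.volume = volume
        self.running = True


def refresh(storage, source, target):
    """Rebase target's data on source's; return the id of the old volume."""
    snap = storage.snapshot(source.volume)  # snapshot the test DB data image
    clone = storage.clone(snap)             # clone the snapshot to a new image
    target.running = False                  # shut down the DB node
    old_volume, target.volume = target.volume, None  # detach the old image
    target.volume = clone                   # attach the clone
    target.running = True                   # start it back up
    return old_volume
```

Because the dev node gets its own clone, anything dev does to the data afterwards never touches the test DB.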
A couple of points for you:
Adding VMs to the development cycle is just adding an unnecessary level of complexity. Need a new environment? Use an existing web server to set up a new site, they're designed for this and it's the proper layer to solve that at.
Also, scalable apps work via shared services, this is on top of the VM layer, so again we just don't care.
There are advantages to rapidly deploying VMs... and thus something like VMware has templated VMs to make the process 15 minutes. My main point is that none of those advantages have anything to do with the development cycle whatsoever... remember, the only significant portion of PayPal's IT infrastructure is their website, through which they do 100% of their business, so when we talk about increasing efficiency and reaction time for PayPal, things like business intelligence and improving the development cycle come to mind... VM automation is well outside of that and won't help them add features faster to their site.