Slashdot Mirror


Ask Slashdot: Building a Web App Scalable To Hundreds of Thousands of Users?

AleX122 writes "I have an idea for a web app. Things I know: I am not the first person with a brilliant idea. Many other 'inventors' have failed and it may happen to me, but without trying the outcome will always be failure. That said, the project will be huge if successful. However, I currently do not have the money needed to hire developers. I have pretty solid experience in Java, GWT, HTML, Hibernate/EclipseLink, SQL/PLSQL/Oracle. The downside is the nature of the project. All applications I've developed to date were hosted on a single server or in a small cluster (2 Tomcats with fail-over). The application, if I succeed, will have to serve thousands of users simultaneously. The userbase will come from all over the world. (Consider infrastructure requirements similar to a social network.) My questions: What technologies should I use now to ensure easy scaling for a future traffic increase? I need distributed processing and data storage. I would like to stick to open standards, so Google App Engine or a similar proprietary cloud solution isn't acceptable. Since I do not have the resources to hire a team of developers and I will be the first coder, it would be nice if the technology used is Java related. However, when you have a hammer, everything looks like a nail, so I am open to technologies unrelated to Java."

274 comments

  1. Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 2, Insightful

    http://www.codinghorror.com/blog/2010/01/cultivate-teams-not-ideas.html

    1. Re:Cultivate Teams, Not Ideas by K.+S.+Kyosuke · · Score: 1

      ...unless you actually want to make any qualitative breakthrough. That would depend on ideas and individuals. But, yes, this is not likely to be a case for such approach.

      --
      Ezekiel 23:20
    2. Re:Cultivate Teams, Not Ideas by crutchy · · Score: 5, Funny

      teams are much better at solving problems than individuals

      even this slashdot forum could be thought of as a sort of team, in that many people are coming together to address a problem

      ok there is no leadership and its full of trolls, shills and idiots... maybe it's not really a team... more like a committee... ok so you're probably doomed

    3. Re:Cultivate Teams, Not Ideas by MightyYar · · Score: 4, Insightful

      Yup, his best bet is to find a good dick-head business type to partner up with and split 50/50 (or less if necessary). Edison died famous and rich. Much smarter men have died penniless and frustrated. Find an Edison and be his Tesla - but be smart enough to stake your claim in black and white.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    4. Re:Cultivate Teams, Not Ideas by phantomfive · · Score: 4, Funny

      even this slashdot forum could be thought of as a sort of team, in that many people are coming together to address a problem

      Good point. That's the best argument against teams I've ever seen!

      --
      "First they came for the slanderers and i said nothing."
    5. Re:Cultivate Teams, Not Ideas by crutchy · · Score: 1

      you missed the next line

      maybe it's not really a team... more like a committee

      anyone who thinks highly of a committee needs their head examined

    6. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      Individuals create great software, not teams. Teams are ok for factory goods I guess if you want to work on an assembly line. The key is to empower people with domain knowledge to create.

    7. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      teams are much better at solving problems than individuals

      Your first statement demands a citation as there is pretty good evidence to suggest that the opposite is true. vid. "There is an I in team" by Marc de Rond for starters.

    8. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      ok, so according to everyone, he's going to fail, whatever. So can someone answer his question, I think it's a good one

    9. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      Do the Zuckie Boogie, get in bed with the Fed, funding and 10,000 servers will materialize overnight. Scaling is not a problem.

    10. Re:Cultivate Teams, Not Ideas by crutchy · · Score: 1

      Individuals create great software not teams

      i would argue that individuals create great code, not great software, and that great leaders build great teams

      Teams are ok for factory goods I guess if you want to work on an assembly line

      assembly line workers aren't really what i would call "teams"... many don't even talk to each other while they work... maybe i would call them "gangs", as in definition 4 of http://www.thefreedictionary.com/gang

    11. Re:Cultivate Teams, Not Ideas by crutchy · · Score: 1

      if you have ever worked in a team environment the proof is obvious

      you have apparently not had the pleasure of such an experience

      good teams usually depend on a strong and competent leader (as opposed to a boss, who may not be a good leader)

      if you want a specific citation, may i suggest http://www.google.com.au/#output=search&q=teamwork

    12. Re:Cultivate Teams, Not Ideas by crutchy · · Score: 1

      i'm sure if you blew bernanke to his satisfaction you would be set for life (no servers or scaling required)

    13. Re:Cultivate Teams, Not Ideas by Taco+Cowboy · · Score: 2

      teams are much better at solving problems than individuals

      Please correct me if I'm wrong ...

      Based on my experience of past few decades (from the 1970's) in the tech field, the conclusion that I get is the reverse

      Teams are much better at IDENTIFYING problems

      On the other hand, people are much better at solving problems when they are in the "individual mode", than when they are part of a "committee", aka "teams"

      As I said, I may be wrong, and if I am, please correct me

      Thank you !

      --
      Muchas Gracias, Señor Edward Snowden !
    14. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      Errm, I believe Tesla would disagree strongly to this suggestion. Yea, I know "claim in black and white" bla bla. Business types are the kind who sometimes make their living on getting away with breaking such deals through sometimes very clever tricks.

    15. Re:Cultivate Teams, Not Ideas by MightyYar · · Score: 1

      Tesla "did it wrong". He was brilliant in many ways, but not in business.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    16. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      Even Edison got the shaft from JP Morgan. The company used to be called Edison General Electric.

    17. Re:Cultivate Teams, Not Ideas by Hognoxious · · Score: 1

      OK, you're wrong.

      I don't think "team" and "committee" are exact synonyms, plus different people have different expertise and it's rather difficult to bounce ideas off yourself.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    18. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      It is the other way around, but that guy ought to just start CODING and make something that works some day; then he can worry about the team and the money and whatever is not covered by _Idea_. Or can his idea not command those resources, a la financing?

    19. Re:Cultivate Teams, Not Ideas by Anonymous Coward · · Score: 0

      even this slashdot forum could be thought of as a sort of team, in that many people are coming together to address a problem

      Good point. That's the best argument against teams I've ever seen!

      No it's not!

  2. need more disclosure by Anonymous Coward · · Score: 0

    You've told us way too little for us to really help you. Also, you sound like one of those people who, ah, nevermind.

    1. Re:need more disclosure by jimmetry · · Score: 1

      people who what? PEOPLE WHO WHAT?! i need more disclosure *twitches*

    2. Re: need more disclosure by iamhassi · · Score: 1

      one of those people worried about where to hide all the gold they're going to have someday? "My app will be sooo successful! I need a team of people to make it for me because I don't know how!" how about you make your super successful app, and if anyone ever bothers to use it *then* worry about scaling it up, mmmmk?

      --
      my karma will be here long after I'm gone
    3. Re:need more disclosure by tnk1 · · Score: 2

      In short, build something with a best guess of what you will need, pick whatever you like to scale it with, and then when it is successful enough, sell it to another company who already knows how to scale it.

      If you have to, or insist on owning the idea and the company behind it, start hiring people to help you. Scalability isn't a one person job after a certain point. In fact, if you want to see if it is really enterprise scalable, don't even just show it to developers, throw the app on a box and let a real sysadmin look at the specs and poke at it. You should be able to glean how scalable and reliable it is by simply measuring the look of horror on his face.

    4. Re: need more disclosure by Anonymous Coward · · Score: 2, Insightful

      one of those people worried about where to hide all the gold they're going to have someday? "My app will be sooo successful! I need a team of people to make it for me because I don't know how!" how about you make your super successful app, and if anyone ever bothers to use it *then* worry about scaling it up, mmmmk?

      that's a terrible idea - the last thing you want is to have a terrible user experience (requests timing out etc) and deal with complete rewrite (which needs to be completed by yesterday) the moment your site becomes vaguely successful.

      This can easily kill your project at a critical moment - users don't like to hear "sorry, but we couldn't anticipate that (some blog/the local newspaper/...) would mention our site and people would actually try to use it; please come back in 3-6 months when we have rewritten it for scalability".

      Ignoring scalability "until I have users" is a great way to keep costs down while making sure that you cannot ever be successful. If you think that is the financially rational thing to do (because 99.9% of such projects don't succeed anyways) then you shouldn't sink money into a website that is (in your opinion) bound to fail at all. But if you are going to invest then you have to invest enough that you actually stand a chance at success (however slim that may be).

    5. Re: need more disclosure by Gr8Apes · · Score: 1

      Having been at several companies that do define scalability as an initial concern (ok, in one case being forced to realize it ;) Let's say that ignoring scalability and bolting it on later has the same effect as ignoring security and attempting to bolt that on later. You wind up sticking bandaids on bandaids, and that always leads to the same end.

      --
      The cesspool just got a check and balance.
    6. Re: need more disclosure by Quirkz · · Score: 1

      Ignoring scalability "until I have users" is a great way to keep costs down while making sure that you cannot ever be successful.

      And paying for enough resources to support ten million users when you don't even have an app is a great way to bankrupt yourself uselessly.

      I've seen both problems, more than once. I agree it's still important to plan for scalability, but today's virtual/cloud environments make the hardware part of the equation almost trivial.

  3. Check out the Evergreen ILS's Opensrf project by thirdpoliceman · · Score: 2

    http://evergreen-ils.org/opensrf.php I do not know much about it. Here is Dan Scott's first paragraph from his 'Easing Gently into OpenSrf' article: OpenSRF is a message routing network that offers scalability and failover support for individual services and entire servers with minimal development and deployment overhead. You can use OpenSRF to build loosely-coupled applications that can be deployed on a single server or on clusters of geographically distributed servers using the same code and minimal configuration changes.

    1. Re:Check out the Evergreen ILS's Opensrf project by natschil · · Score: 1

      Evergreen's OpenSRF is a great service. It's a shame that (to my knowledge) nobody apart from the evergreen people actually use it.

    2. Re:Check out the Evergreen ILS's Opensrf project by black6host · · Score: 1

      http://evergreen-ils.org/opensrf.php I do not know much about it.

      I was reading that as OpenSerf.php. What a great idea! All the help you need for a pittance, if anything at all. I believe serfdom is highly underrated and I'm glad to see people bringing it back!

      Seems like the submitter wouldn't take kindly to an open environment lest he lose all his yet to be found new found riches so let's get the BSD types on board with UnlimitedHordes! Just be sure no one knows what anyone else is doing lest they put it all together and GIVE it away, no license needed!

  4. this will help by Anonymous Coward · · Score: 0

    http://www.rabbitmq.com/java-client.html

    1. Re:this will help by Anonymous Coward · · Score: 2, Funny

      and vector graphics because they scale like a leprotic fish.

  5. How can we help? by Anonymous Coward · · Score: 1

    Unfortunately I do not understand the requirements of your application. Scaling to thousands of users tells me nothing about where your bottlenecks are in terms of performance. What are your I/O requirements? How many users can you handle per server? Would a sharded database model work, or how about an eventually consistent NoSQL database like Cassandra? Posting this here is honestly an exercise in futility.
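For readers unfamiliar with the "sharded database model" mentioned above, here is a minimal sketch of the routing idea only. Everything in it is an assumption for illustration: `shard_for` and `ShardedStore` are hypothetical names, and plain dictionaries stand in for what would be separate database servers.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy stand-in for N independent databases, one dict per shard."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, user_id: str, record: dict) -> None:
        # All data for one user lands on exactly one shard.
        self.shards[shard_for(user_id, len(self.shards))][user_id] = record

    def get(self, user_id: str):
        # Reads go to the same shard the write went to.
        return self.shards[shard_for(user_id, len(self.shards))].get(user_id)


store = ShardedStore(num_shards=4)
store.put("alice", {"country": "PL"})
store.put("bob", {"country": "AU"})
print(store.get("alice"))
```

One caveat with plain modulo routing like this: changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often considered instead. Whether sharding fits at all depends on the access patterns the comment is asking about.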

    1. Re:How can we help? by Anonymous Coward · · Score: 1

      Or, it's not an exercise in futility. I think AleX122 will quickly figure out that instead of trying to code this huge and wonderful project up by himself, try to find actual qualified coders to help for just a promise of a windfall, etc, his best bet may simply be to design it all and see what kind of patent protection he can get.

      Besides, he will (or definitely should) design it all up before he starts coding. If he creates the design, he has something to protect, to show others to recruit, etc. One thing for certain is that posting that he has a nebulous idea and asking how he can realize step 4) Profit! is not going to get him very far on here.

      I'd like to establish a moon base. I wonder where I can get engineers who will donate their time or work on a promise of greatness should we succeed?

  6. Scala by Anonymous Coward · · Score: 1

    Learn Scala and use any of these great technologies: Play, Akka, Spray, Finagle

    They are all high-quality tools designed for scale.

    1. Re:Scala by Anonymous Coward · · Score: 0

      Right, because learning a bunch of new technologies guarantees you won't use them poorly!

    2. Re:Scala by Anonymous Coward · · Score: 0

      Using those tools poorly will result in a more scalable application than you'd get if you used many popular tools correctly. Plus, as you use them, you'll learn them and start using them less poorly.

      Scalability isn't about being an expert in a tool. It's about learning to deconstruct your application into components that work together so that you can scale bottlenecks individually. With the exception of Play, which is more of a web framework, the other three technologies focus on creating separation points to allow you to scale each side separately.

      A tool that just makes it simple to build a monolithic application will end up getting you only so far--likely to the point where the typical load-balanced app servers and replicated databases don't work anymore. That's when you need an SOA and that's when you'll need these tools.
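The "separation point" idea in the comment above can be sketched in a few lines. This is an illustration under stated assumptions, not how Akka or Finagle work internally: an in-process `queue.Queue` stands in for a real message broker, `run_pipeline` is a hypothetical name, and squaring a number is a placeholder for a slow back-end task.

```python
import queue
import threading


def run_pipeline(items, num_workers):
    """Front end enqueues work; a separately sized worker pool drains it."""
    jobs = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = jobs.get()
            if item is None:           # sentinel: no more work for this worker
                return
            processed = item * item    # placeholder for the slow back-end step
            with lock:
                results.append(processed)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:                 # the "front end" only enqueues
        jobs.put(item)
    for _ in threads:                  # one sentinel per worker
        jobs.put(None)
    for t in threads:
        t.join()
    return sorted(results)


print(run_pipeline(range(5), num_workers=3))  # [0, 1, 4, 9, 16]
```

The point of the design is that `num_workers` can change without touching the code that enqueues work, so the slow side can be scaled on its own once it becomes the bottleneck.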

  7. Heroku by Anonymous Coward · · Score: 4, Insightful

    Just use Heroku. Honestly you DO NOT need to worry about this problem. If you don't make enough money by the time you get 10,000 users to hire someone to solve this problem for you then your idea is not as great as you think it is.

    1. Re:Heroku by Baby+Duck · · Score: 5, Informative

      OpenStack. You can start with a hosting provider like Rackspace that has a faithful implementation of it. I know they were recently pinged for some incompatibility, but they have vowed to fix that. If you still can't stomach it, choose a different OpenStack provider. OpenStack is the key.

      When you get really big, then you can work on running your own datacenter or paying someone to host the hardware for you (again, Rackspace, DreamHost, etc.). Then you can put your own implementation of OpenStack on the hardware with all the customization specific to your needs. This will naturally build on top of your years of investment with the vanilla OpenStack when you were smaller. The progression path is laid out for you.

      I'm replying to this parent because Heroku is also an excellent choice for scaling where you pay as you grow. I'm just not sure if you can later fork Heroku to suit your needs with the datacenter supplier of your choice.

      --

      "Love heals scars love left." -- Henry Rollins

    2. Re:Heroku by gabereiser · · Score: 2

      I was gonna mention that worrying about scaling before the app is built is a waste of time. Build the app. Get to your capacity. Modify app. Grow some more. Iterate that several hundred times and you build as you grow, you bill as you grow, and you scale as you grow. Don't fret worrying about how to serve to millions before you've served to thousands.

    3. Re:Heroku by Anonymous Coward · · Score: 0

      My understanding is that OpenStack and Eucalyptus and similar technologies help you automate tasks around creating, deploying, and configuring multiple virtual machines. That's great for scaling the infrastructure behind your application, but not the application itself. Does it include load balancing between servers, some kind of very easy to use distributed storage system (SQL or NoSQL), etc....?

    4. Re:Heroku by Anonymous Coward · · Score: 0

      Heroku and EngineYard are terrible. RoR in general is terrible for cost effective scale. It's just rapid developer coddling.

    5. Re:Heroku by Anonymous Coward · · Score: 0

      If you don't make enough money by the time you get 10,000 users to hire someone to solve this problem for you then your idea is not as great as you think it is.

      That's nonsense. 10,000 users aren't going to be profitable for many great ideas.

    6. Re:Heroku by Anonymous Coward · · Score: 1

      You can start with a hosting provider like Rackspace that has a faithful implementation of it.

      OpenStack to start cracking down on incompatible clouds

    7. Re:Heroku by Jane+Q.+Public · · Score: 1, Insightful

      "OpenStack. You can start with a hosting provider like Rackspace that has as a faithful implementation of it."

      Ahem. Just 2 days ago an article discussed here on Slashdot pointed out that Rackspace is not compliant with OpenStack standards.

    8. Re:Heroku by Anonymous Coward · · Score: 0

      Congrats, did you read the next sentence?

    9. Re:Heroku by Anonymous Coward · · Score: 1, Informative

      Did you even bother to read the next sentence after that?

    10. Re:Heroku by rwa2 · · Score: 1

      Maybe his application is performance critical, and that's why the others in the domain have failed. I think performance / latency was always a priority with Google when they entered an already-crowded search engine space, and that was one of the main things that drew people to their service from the established competition.

      Also sounds like he already has his prototype app working on 2 boxes, and doesn't want to pull an EA by launching with that. It's non-trivial to just scale out N instances if they all have to coordinate back to a few common bottleneck databases that don't scale.

      The established scalability architecture I see a lot of these days looks like some sort of CDN in front of one or more datacenters each which consist of a load balancer (such as nginx) that spreads the load out between a dynamic pool of web/application servers (php / tomcat), which access a dynamic pool of cache storage (memcached / redis) in front of some kind of DB backend (mySQL with a few read replicas, or some other DB tuned for sustaining the max writes sanely enough to perform backup / restore on the entire persistent dataset)
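The cache tier in the layout above (memcached / redis in front of the DB) is commonly used in a "cache-aside" fashion: check the cache, fall back to the database on a miss, then store the result. A minimal sketch follows, with loud assumptions: `CacheAside` is a hypothetical class, a dictionary stands in for the cache server, and `load_from_db` is whatever function actually queries the database.

```python
import time


class CacheAside:
    """Read-through helper: serve from cache, load from the DB on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # function that queries the real database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)
        self.db_reads = 0              # counter, only here to show the effect

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # fresh cache hit, database untouched
        self.db_reads += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so the next read goes back to the database."""
        self._entries.pop(key, None)


database = {"user:1": "alice"}         # stand-in for the DB backend
cache = CacheAside(database.get, ttl_seconds=30.0)
cache.get("user:1")
cache.get("user:1")
print(cache.db_reads)  # 1
```

The trade-off is staleness: until the entry expires or is invalidated, readers may see an old value, which is usually acceptable for profile-style data and less so for balances or inventory.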

    11. Re:Heroku by Anonymous Coward · · Score: 0

      Does the next sentence matter? Rackspace is not compliant. The second statement does not invalidate the first and requires you to click a link and read through an article to even understand the relationship. The link is establishing that but the first statement is factually incorrect. He could have just stated "You can start with a hosting provider that has a faithful implementation of it." but fucked up. He was rightly corrected.

    12. Re:Heroku by Jane+Q.+Public · · Score: 1

      "Congrats, did you read the next sentence?"

      What difference does it make? A promise to make things better "by next year" does not make Rackspace any more compliant with OpenStack today.

    13. Re:Heroku by Jane+Q.+Public · · Score: 1

      "Did you even bother to read the next sentence after that?"

      Yes. But his next sentence hardly matters. It doesn't change anything. They are compliant, or they are not. They promised to become compliant by next year. Big deal.

    14. Re:Heroku by Anonymous Coward · · Score: 0

      Since you are speaking of Java, CloudBees (http://www.cloudbees.com/) provides a pay as you go model, starting at free. They can cover development, testing, and deployment. As you scale, you pay them more for the additional bandwidth/storage/CPU.

    15. Re:Heroku by Anonymous Coward · · Score: 0

      Heroku does more than Ruby on Rails.

  8. Show me the users! by Anonymous Coward · · Score: 5, Insightful

    Before going all-out to reinvent the wheel on yet-another-next-big-thing web app, why not roll out a proof-of-principle version letting someone else competent do the "heavy lifting" back-end work. Use an existing cloud/hosting service like Amazon EC2 (they'll do a lot better on the basic back-end stuff than your "I'm incompetent but building a cloud app anyway" approach). After you get your first hundred thousand users, and have investment rolling in by the gazillions, then you hire your own crack team of cloud experts to design your own custom back-end solution (or just sell out for a couple hundred million to whatever group of suckers thinks your zero-dollar-per-user profit model will start paying off once they hit the million-user mark).

    1. Re:Show me the users! by Anonymous Coward · · Score: 0

      So...hire 14-year-olds?

    2. Re:Show me the users! by buybuydandavis · · Score: 1

      Sounds right to me. Protoype quickly. Don't worry about scaling. Get it out there and see if it works. If it does, then you worry about scalability.

    3. Re:Show me the users! by ATMAvatar · · Score: 5, Interesting

      This. The submitter has made an assumption that there will be hundreds of thousands of users. There might not. The only sure thing is that if he spends all his time trying to build a platform capable of serving hundreds of thousands of users right out of the gate, the project will probably fail before a single user sees it.

      Remember: not even Facebook, Twitter, or eBay started off with platforms capable of handling their current load. They all started with something quick and built things out as their respective user bases grew.

      --
      "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
    4. Re:Show me the users! by Anonymous Coward · · Score: 0

      "whatever group of suckers thinks your zero-dollar-per-user profit model will start paying off once they hit the million-user mark"

      Excellent!

    5. Re:Show me the users! by mooingyak · · Score: 4, Interesting

      Pretty much the same thought I had.

      Step 1 is to get a version that works for one user.
      Step 2 is to get more than one user.

      You're jumping a few steps ahead of the game.

      --
      William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
    6. Re:Show me the users! by dbIII · · Score: 1

      You can see from the post above why professional engineers laugh at the self proclaimed "software engineers". Basket weaving on the fly is different to design.

    7. Re:Show me the users! by durdur · · Score: 1, Insightful

      True enough, but you do not want to have the issue where the first sign of your success is your website failing. Early users get turned off if the service is flaky. So you can't just throw up a free website and wait to see when and where it crashes. A little planning is always good and so is a good reasonable starting architecture. That would include for example designing from the start for running with multiple backend servers behind a load balancer.

    8. Re:Show me the users! by Nerdfest · · Score: 1

      PEngs will soon be able to do the same thing they laugh at software people for when 3D printing is more mainstream. And they will use it for exactly that.

    9. Re:Show me the users! by AleX122 · · Score: 1

      I am not going to build a platform capable of serving thousands of users right out of the gate. However I do not want to end up having to do revolutionary refactoring (instead of evolutionary one) or even rewrite the application when it starts failing due to too much traffic. That is why I want to prepare my application as well as I can for scaling, and of course when it starts making some profits I will hire more people and probably someone more skilled and with experience with cloud solutions.

    10. Re:Show me the users! by Tablizer · · Score: 1

      not even Facebook, Twitter, or eBay started off with platforms capable of handling their current load.

      And still cannot handle it ;-)

    11. Re:Show me the users! by Anonymous Coward · · Score: 0

      Wrong!

      Build scalable!

    12. Re:Show me the users! by megalomaniacs4u · · Score: 1

      You will have to do major rewriting whether you want to or not. You will not cover every base out of the gate. Your users will do stuff and break stuff that you will never think of at this stage. Hell you may come up with a clever way to implement a feature - but that may only occur several months after the users have been exposed to the product.

      It is just a fact of life. No plan is perfect, no software is perfect, and nothing ever survives the userbase intact.

    13. Re:Show me the users! by rcharbon · · Score: 1

      "Step 1 is to get a version that works for one user. Step 2 is to get more than one user." Step 3: Piss off your early adapters when too many users kill the system. Plan for success. If you're not expecting to succeed, why bother doing it at all?

    14. Re:Show me the users! by Gr8Apes · · Score: 1

      I have built and am still building platforms scaling to tens of thousands of concurrent users. I have been doing this for more than a decade. It's not exactly difficult to do so, what is difficult is making sure you don't use the latest gee whiz widget that does things in a non-scalable way, or have developers do short-cuts under the covers that cause issues. To do so successfully requires understanding your entire selected stack and designing your layers to avoid as many bottlenecks as possible. Another big one is to design those layers, because that's how you'll be scaling your app. You'll also have to make some choices: High Performance, High Availability, High Reliability, pick any two, was what we used to say. At 2 companies, we did all 3. Both were bought, hence I moved on. Since 2000, they have all been at least partially Java based.

      --
      The cesspool just got a check and balance.
    15. Re:Show me the users! by Gr8Apes · · Score: 1

      exactly! Building to be scalable is not all that hard, but it does require a level of understanding.

      --
      The cesspool just got a check and balance.
    16. Re:Show me the users! by Anonymous Coward · · Score: 0

      True enough, but you do not want to have the issue where the first sign of your success is your website failing.

      Like Twitter.

    17. Re:Show me the users! by mooingyak · · Score: 1

      The original question came from someone who sounds like he doesn't have experience building large-scale apps. If he tries to solve that problem first, he's going to either waste time when a real demand never arises, or else waste time when he finds out he did his scaling all wrong. Until you've been doing it a while, the bottlenecks aren't as obvious as one might imagine.

      --
      William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
    18. Re:Show me the users! by Anonymous Coward · · Score: 0

      You do have to understand the areas of your application that will scale non-linearly. The quintessential example of this was Friendster. They did everything you've suggested but they had one feature that scaled exponentially and they didn't bother to design it in a way that was scalable. The result was that it started taking 20 seconds to compute the size of your friend network and that happened on every login.

      Had they thought ahead and designed their application to scale (and not made the mistake of killing off the Fakesters), they might be the ones with the $65b market cap and Facebook might have never caught on.
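To make the "scales non-linearly" point concrete, here is a small sketch of counting a friend network by breadth-first search. It is not Friendster's code and makes no claim about their implementation. The graph is a hypothetical tree in which every user introduces 10 new friends, which is a deliberate simplification.

```python
from collections import deque


def network_size(friends, start, max_depth):
    """Count users reachable from start within max_depth hops (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        user, depth = frontier.popleft()
        if depth == max_depth:
            continue                   # do not expand beyond the hop limit
        for friend in friends.get(user, ()):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return len(seen) - 1               # exclude the starting user


def build_tree(branching, depth):
    """Hypothetical test graph: each user adds `branching` new friends per level."""
    friends, level, next_id = {}, [0], 1
    for _ in range(depth):
        new_level = []
        for user in level:
            kids = list(range(next_id, next_id + branching))
            next_id += branching
            friends[user] = kids
            new_level.extend(kids)
        level = new_level
    return friends


graph = build_tree(branching=10, depth=3)
for hops in (1, 2, 3):
    print(hops, network_size(graph, 0, hops))
# 1 10
# 2 110
# 3 1110
```

Each extra hop multiplies the work by roughly the average number of friends, so running this on every login gets expensive quickly as the user base grows. Precomputing or caching the figure is one commonly suggested way to move that cost off the login path.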

    19. Re:Show me the users! by Quirkz · · Score: 1

      nothing ever survives the userbase intact.

      And thus was the eleventh commandment written!

  9. Scalability by t00le · · Score: 2

    I would likely build a front-end using a couple of HAProxy load balancers hitting an Apache cluster running opencluster. Use red-black trees with MySQL and cluster a few databases across multiple locations. I would build the front-end with Python and HTML5, as well as using IPython for cluster controls and other fun stuff.

    In my case I have a rack of HP p-class blade servers that use an Amazon EC2 Centos box to route inside/outside of EC2. When we test something out we use my cluster at home, then when we roll an app or website out we keep it at my house. If the load gets high, then we simply modify the cluster to bring up slave web servers, cache servers, etc. In our case we build the backend first and can roll out an app or web service for very little money or resources, but if we have success with something we just leave it on EC2 since it can likely pay its own bills.

    --
    When the only tool you have is a hammer, every problem looks like a nail
    1. Re:Scalability by Anonymous Coward · · Score: 0

      If you're paying for power and don't have a capital investment, AWS is about the best you'll do in terms of cost and flexibility. Here is their published reference architecture for a scalable web app.
      http://media.amazonwebservices.com/architecturecenter/AWS_ac_ra_web_01.pdf
      The key is to make sure state is only persisted on the user side or in something centrally/universally accessible like a database with replicas or a distributed database like dynamodb. This way you can scale instances of the app server and web server with load.

      With the geographic distribution you mentioned, using Route 53's latency-based routing, you can set up in multiple AWS regions and have the traffic route accordingly.

      And finally, to reference the user's comment before: while he has a point that finding users should be a top priority, with the pay-per-use model of cloud, designing and building properly to handle scale isn't much more expensive, as your consumption scales with your load (user base). That said, I would ignore any notions of building junk that works immediately only to have to refactor it when it gets popular. That tends to lead to poor customer experiences.

  10. nope by Frosty-B-Bad · · Score: 1

    You either use a cloud service like Microsoft's or Amazon's so you can scale quickly to meet demands, or go all hax0r and build your own farm out and 'stick to open standards' blah blah, in which case you'll most likely get so sidetracked administering your servers that your project will slip into the background. Who cares if your code is some type of 'open standard' when it doesn't exist. If this bazinga idea will be used 'world wide' and make buckets of money, make the money and then open all your stuff up like all those other corporations do.

    1. Re:nope by Anonymous Coward · · Score: 0

      Correct, you need to utilize the work done by others much smarter than you (no offense). I think Amazon's framework will let you deploy almost anything. App Engine has its own libraries for data management and just about everything else under the sun, but they are working on their Maven plugin to allow JAX-RS projects to deploy, hopefully as bundled archive files. I don't know about Azure, but I'm sure it's similar. The moral is: get to making content rather than trying to manage terabytes of data and Hadoop clusters.

  11. Silly priorities by Anonymous Coward · · Score: 5, Interesting

    YouTube was a lame app with a basic MySQL setup. Same with Facebook. When it took off, they hired gold people and fixed the scalability issues. Twitter didn't exactly put scalability first either.

    So get real. Don't worry about "hundreds of thousands of users", but about getting something decent out there for users to try. If users come, you'll get scalability sorted out.

    1. Re:Silly priorities by phantomfive · · Score: 2

      Twitter didn't exactly put scalability first either.

      Which is strange to me, because Twitter seems like such a simple website, and yet they had massive scalability problems. Even today, sometimes. I've wondered what is so difficult about their website that has caused them such problems?

      --
      "First they came for the slanderers and i said nothing."
    2. Re:Silly priorities by the+eric+conspiracy · · Score: 3, Informative

      It wasn't the website, it was the backend that had problems.

      I remember they had started with Ruby on Rails which is notorious for being able to get you up fast and then failing to scale.

      They then offloaded parts of the infrastructure to Scala of all things.

      http://blog.redfin.com/devblog/2010/05/how_and_why_twitter_uses_scala.html

      Scala is interesting and has some good paradigms built in to the language for the things Twitter needs to do. Not sure if it is really fundamentally better than Java though - after all it runs on the same JVM.

      Anyway if I was starting something like this out and I already knew Java I would go with Java. There are enough large sites running it, and there are a lot of people out there who know it so I would feel some confidence that I could do what I needed to do.

      Plus I like static typing.

    3. Re:Silly priorities by Anonymous Coward · · Score: 1

      If you like static typing and buy into the current craze over functional programming, Scala is awesome. Even if you just translate a Java program into Scala class by class, the Scala equivalent code will let you cut the code size in half. If you start using some of Scala's better features, like easier higher-order functions and case statements on steroids, you will cut the code size even further. And you can still use Java libraries and Maven, run your applications with Netty or Glassfish, run them on Heroku, etc...

    4. Re: Silly priorities by Anonymous Coward · · Score: 0

      It's because they used a toy of a framework (Ruby on Rails). When they ditched that garbage and went to an ecosystem designed, built and proven for scale (Java, after a brief detour being dazzled by the latest trends [Scala]), the fail whale went largely extinct.

    5. Re:Silly priorities by Anonymous Coward · · Score: 0

      More than the backend, it was the sudden and massive onslaught of users.

    6. Re:Silly priorities by Anonymous Coward · · Score: 5, Informative

      "They then offloaded parts of the infrastructure to Scala of all things.

      http://blog.redfin.com/devblog/2010/05/how_and_why_twitter_uses_scala.html

      Scala is interesting and has some good paradigms built in to the language for the things Twitter needs to do. Not sure if it is really fundamentally better than Java though - after all it runs on the same JVM."

      Disclaimer: I was a developer at Twitter until last year.

      From the point of view of scalability, Scala is so much more advanced than Java it's not even funny. Ultimately, this boils down to the adoption of immutability as a core concept of the language. In particular, Scala's approach to concurrency is a decade or more ahead of what's in use in Java. Finagle, Twitter's async RPC system, simply wouldn't have been deliverable in a language that makes the use of Futures as difficult as Java does.

      "Plus I like static typing."

      Scala is statically typed.

    7. Re:Silly priorities by rekoil · · Score: 3, Informative

      Disclaimer: Another Twitter engineer here. What my apparently former colleague said, plus X.

      Also: Don't be afraid to add caching layers when you see your web server or DBs start to run hot. Putting a memcached instance in place in "front of" your database layer is much easier than sharding the database layers to relieve load - eventually you'll have to do both, but you'll definitely want the memcache layer first. Same with web caches/proxies - putting varnish or squid in front will take some pressure off before you need to implement load balancers.
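
      A rough sketch of what "a cache in front of the database" means in code, assuming a read-through pattern. A plain dict stands in for memcached here, and fake_db_lookup is a made-up placeholder for a real query; none of these names come from an actual client library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ReadThroughCache:
    """Answers reads from memory when possible; a dict stands in for memcached."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]], ttl_seconds: float = 60.0):
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        # key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, Optional[str]]] = {}

    def get(self, key: str) -> Optional[str]:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is never touched
        value = self._load_from_db(key)  # miss or expired: fall through to the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call after writing to the database so readers do not see stale data."""
        self._store.pop(key, None)


db_calls = []  # records every time the "database" is actually hit


def fake_db_lookup(key: str) -> str:
    """Hypothetical stand-in for a real database query."""
    db_calls.append(key)
    return "row-for-" + key


cache = ReadThroughCache(fake_db_lookup, ttl_seconds=60.0)
first = cache.get("user:1")
second = cache.get("user:1")  # served from the cache, no second database call
```

      The hard part in practice is invalidation: every write path has to remember to drop or refresh the cached entry.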

    8. Re:Silly priorities by Anonymous Coward · · Score: 0

      ... Same with web caches/proxies - putting varnish or squid in front will take some pressure off before you need to implement load balancers.

      +1 for using varnish (which already includes loadbalancing technology)

    9. Re:Silly priorities by miroku000 · · Score: 1

      If you like static typing and buy into the current craze over functional programming, Scala is awesome.

      Wait, what? There is a functional programming craze? When did that happen?

    10. Re:Silly priorities by DuckDodgers · · Score: 1

      It's not nearly as big as object-oriented programming, but:

      Scala allows procedural programming but can be written in a totally functional style.
      Clojure (another JVM language, and a Lisp dialect) is functional by default, uses laziness in a few key places, and requires extra steps to operate in a procedural style.
      Java 8 adds lambdas.
      The D language version 2 has the "immutable" keyword, which is const on steroids because it's transitive. immutable x = .... means x is immutable and everything x references is immutable too.
      Perl 6 has its own way of doing Haskell/Scala style pattern matching on function calls: http://perlcabal.org/syn/S06.html#Unpacking_tree_node_parameters

      etc... etc.... it's not exactly taking the industry by storm, but it's getting a lot of attention and while few purely functional languages (i.e. Haskell) are going mainstream, lots of mainstream languages are borrowing features from the functional world.

    11. Re: Silly priorities by Anonymous Coward · · Score: 0

      I'm pretty sure they're still using Scala.

  12. Start smaller by bfandreas · · Score: 5, Insightful

    Do not plan for hundreds of millions of concurrent users right off the bat. That's a very common error a lot of startups make. You do not have such a large userbase. It will take some time until you have.
    Think smaller and scale up when your idea takes off. Set yourself concurrent user milestones at which you rethink your architecture. You will also have to rethink the iron your stuff runs on, and that may dictate what kind of technology you use when you reach your hundreds of millions goal.

    Technology is interchangeable. It's a tool and you choose the best tool for the job and at the moment you have no users and might as well start off with the usual suspects. JSP/Struts, JSF, whatever you are most comfortable with. If in the long run you do find that this is not sustainable and you need to shift to another technology then you can hopefully afford to hire people who know it.

    You really, really should set yourself userbase milestones, plan ahead for reaching them and be prepared when you reach them. For that you need a lot of information. Log how much time users spend on what functionality you offer because this also has an impact on your UI design when you go big. It also has impact on what technology(-ies) you use.


    I usually bill big when I give advice such as this and help setting up a plan when to do what. Your problem is less one of technology but a business one. Think like a businessman first and like a techie second.

    --
    20 minutes into the future
    1. Re:Start smaller by Anonymous Coward · · Score: 0

      Yep. Some more advice: keep your codebase neat so you can stay agile. You need unit tests for everything, and functional tests (through the front end, simulating actual site visitors) for everything. If you have that, you can reimplement any piece of the system that's preventing you from scaling using a different technology and be confident you haven't broken the site. Keep your design well layered; application logic shouldn't be in the database (avoid stored procedures); database logic shouldn't be in the application. Similarly for user interface. And reporting. And logging. And session management. With all of those compartmentalised, you should be able to make major changes to your back end easily and quickly and keep growing your site as and when you need it.

  13. Plan9 from outer space. by SampleFish · · Score: 1, Informative

    It would be cool if you could get Plan 9 working. It's an OS that was designed around distributed computing from the ground up. So much so that the API is hardware agnostic. It doesn't matter what hardware you are running or where it exists. All resources in the cluster are shared automagically. You would need some distributed rackspace in strategic global locations.

    Step one is making a small lab with junk computers.
    Step two is testing your application in this environment.

    If you can get the backend running on Plan 9 then you can start renting servers, installing Plan 9 and adding these servers to your existing cluster. At some point you will be able to turn off the computers at your house and the app will keep running on the remaining cloud servers. It's a pretty sweet idea.

    http://plan9.bell-labs.com/plan9/

    It's kinda like UNIX.

    Good luck.

    1. Re: Plan9 from outer space. by Anonymous Coward · · Score: 0

      Yea good luck with that. I hear Mars needs Women too

    2. Re:Plan9 from outer space. by Anonymous Coward · · Score: 0

      Totally off topic shill post.

  14. The scalable app skeleton; Java flavored by jwkane · · Score: 2

    Java... ok, why not. I would take a look at Cassandra and Zookeeper to get the ball rolling. You'll need a good load balancer; nginx or haproxy since I don't know of a good one in Java. I assume a bunch of Tomcat servers for the actual app. I suppose JBoss Messaging to keep with the Java theme.

    You can get all that on one machine for development, then for deployment you can flexibly adjust the number of db servers, queue servers, load balancers and app servers based on anticipated load. If you're extra-cool you can deploy to a cloud and dynamically allocate servers as-needed.

    Been there, done that. Got the t-shirt. It's fun. Enjoy it.

    Spend an extra day or two thinking about exactly how you're going to handle logging. It will be worth it.

  15. The hammer you know... by Anonymous Coward · · Score: 1

    Is far better than anything else.

    If you are serious then don't waste time and effort staying pure. Get the job done as fast as possible with the best tools you already know how to use.

    Amazon (AWS), Java (Dropwizard is a simple container) and MySQL (or whatever).

    If you luck out and actually do get 100K users then I suspect you won't care about purity so much. You'll have an actual business to run.

    1. Re:The hammer you know... by Anonymous Coward · · Score: 0

      Like he said... except don't use MySQL. There's no reason to use MySQL when PostgreSQL has more functionality, scales better and won't throw your data under the bus. Plus it's as easy to install and use.

  16. Not trying means certain failure by Skapare · · Score: 1

    OTOH, it could mean success in another idea. The problem is you will never know unless you try everything, and resources usually limit that. You have to decide.

    You may be able to do proof of concept with a couple cheap servers. But if it succeeds, time will be extremely short to go to full scale. So you need to think scale up front, but in a way that works downscaled as well. Do everything agile so every component can work all on one server, or separated on many. Use a distinct hostname for everything and let DNS figure where you put it. Use a distinct IP address for every service component, even if on the same machine, so you can separate them tomorrow without having to renumber. Use highly scalable front ends even if the fronts and backs share the same machine for now. Yes, a single interface can have many IP addresses on BSD or Linux.

    Consider a NoSQL option for your data, if it can fit that.

    --
    now we need to go OSS in diesel cars
    1. Re:Not trying means certain failure by Anonymous Coward · · Score: 0

      Holy christ, did you just toss a bunch of vague generalities sprinkled with buzzwords, and call it advice?

      "Do everything agile so every component can work..." -- what does that even mean?

      Use a distinct hostname - chances are, he'll want a load balancer & pools of components.

      Use a distinct IP address - stupid. Once again, he'll probably want a load balancer & pools of service components.

      Yes, a single interface can have many IP addresses -- you seem not to understand what an "IP address" is.

      "Consider a no-SQL option" - this is meaningless, since we have no idea what his data will look like, and is so vague that, even if we DID know what his data will look like, it would be useless anyway.

    2. Re:Not trying means certain failure by Anonymous Coward · · Score: 0

      There are two cases to be made for nosql.

      The first is if you have a very limited idea of what your data structure will look like. If this is true, the first question you should be asking is what business it is exactly that you're getting into. If you really have no domain-specific info that needs to be enforced at the data persistence layer, what do you have?

      The second is if you are scaling past the point where a relational database will suit your needs. If you are operating at the scale of Twitter, eventual consistency is the only consistency you're going to get.

      For every other situation, you should be using a relational database. Does that mean that 90% of those who are using NoSQL are doing it wrong? Absolutely.

    3. Re:Not trying means certain failure by Anonymous Coward · · Score: 0

      I read an argument from a PostgreSQL developer that even for NoSQL, Postgres might be your best option. Postgres can run a bunch of tables with columns "key" (GUID) and "value" (text or binary) just fine. I thought it was an interesting point, but I haven't monkeyed around with NoSQL enough to compare them.
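
      For what it's worth, a key/value table like that is only a few lines. The sketch below uses Python's bundled sqlite3 purely as a stand-in so it runs anywhere; the table name kv_store and the put/get helpers are made up, and on PostgreSQL the upsert would be written with INSERT ... ON CONFLICT instead of SQLite's INSERT OR REPLACE.

```python
import sqlite3
import uuid
from typing import Optional

# In-memory SQLite database as a stand-in for a real PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv_store (key TEXT PRIMARY KEY, value BLOB NOT NULL)")


def put(key: str, value: bytes) -> None:
    """Insert the value, or overwrite it if the key already exists."""
    # INSERT OR REPLACE is SQLite syntax; PostgreSQL would use INSERT ... ON CONFLICT.
    conn.execute("INSERT OR REPLACE INTO kv_store (key, value) VALUES (?, ?)", (key, value))
    conn.commit()


def get(key: str) -> Optional[bytes]:
    """Return the stored value, or None if the key is unknown."""
    row = conn.execute("SELECT value FROM kv_store WHERE key = ?", (key,)).fetchone()
    return row[0] if row is not None else None


guid = str(uuid.uuid4())  # GUID keys, as in the description above
put(guid, b"first version")
put(guid, b"second version")
stored = get(guid)
```

      You keep transactions and backups from the relational world, and you can still add real columns later if a schema emerges.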

  17. Get Partners then Figure it Out by Kagato · · Score: 2

    Can't do it yourself, then get partners. Set up an equity agreement.

    As far as tech goes, this is no longer new territory. Create server images for a cloud host such as AWS or Rackspace. Bring them up or down with Chef. Concerned about the database? Figure out if you really need a relational database. If not, look at a high performance NoSQL DB or something that is more or less always in memory (such as Mongo).

    1. Re:Get Partners then Figure it Out by Anonymous Coward · · Score: 0

      Having built large-scale (tens of millions) applications in the cloud, I can say this is a terrible idea. Specifically AWS and Mongo. If you want scalability headaches, go that route.

    2. Re:Get Partners then Figure it Out by Anonymous Coward · · Score: 1

      Definitely run MongoDB. It's webscale.

    3. Re: Get Partners then Figure it Out by Anonymous Coward · · Score: 0

      You turn it on and it scales right up

    4. Re:Get Partners then Figure it Out by Ash-Fox · · Score: 1

      Following your example of using Mongo, I am interested in hearing how you counter the arguments in this article and why Mongo is a better fit for any task, please.

      --
      Change is certain; progress is not obligatory.
    5. Re:Get Partners then Figure it Out by Kagato · · Score: 1

      You have to match the technology to what you're doing. The article makes some valid points (fail-over on write), but it's a bit dramatic about others (performance in particular). But like anything in high performance computing you really have to understand what's happening. That's particularly true about how Mongo uses memory.

      We successfully used it to report election results (national to local across the US, millions of updates) for several organizations simultaneously. This is a site that handles billions of hits. I saw a fantastic demonstration for DNA sequencing research. They ran it head to head against several other conventional databases. It was an application that didn't need a high volume of writes and Mongo happened to be a good low-effort fit for them.

      Is Mongo right for everything? No way. But it's dramatic to say it's not fit for any task because it can and has been used in several high performance applications.

  18. VPS + OpenStack? by DeeEff · · Score: 1

    Sounds like you may want to check out hosting your stuff over a VPS, maybe with Hawkhost (http://www.hawkhost.com/vps-hosting) or some similar provider?

    I guess the general idea is that you'd want to install / set up your own OpenStack (cloud) solution, and then scale VPS coverage if you need it, without having to install / clone over multiple machines. Check out OpenStack and Java integration. As far as I know there's an SDK available: https://github.com/woorea/openstack-java-sdk, but I'm not sure how complete it is, what features it offers, or even how you would go about setting up your project, considering how vague you were in TFA.

    In any case, this may be a good starting point for you to look. Alternatively you could host everything out of your own house on your own servers, but that scales terribly if you need to buy 50 more servers, so I wouldn't recommend it.

    1. Re:VPS + OpenStack? by Anonymous Coward · · Score: 0

      Someone asks about how to sculpt and the first thing you do is whore some advert about a random chiseling tool. He has no idea how to design his app to make it scalable. Who the fuck cares which VPS provider he's using?

      I'm not sure how complete it is, what features it offers, or even how you would go about setting up your project

      Alternatively you could host everything out of your own house on your own servers, but that scales terribly if you need to buy 50 more servers, so I wouldn't recommend it.

      Got any other retarded alibi "advice", Sherlock? Seriously, why the fuck do you even post?

  19. Bitcoin Exchange? by Anonymous Coward · · Score: 0

    I know, MT. Gox sucks

    But, we are not here to give free advice to those that will take our money later

    1. Re:Bitcoin Exchange? by Anonymous Coward · · Score: 0

      I'm pretty sure you're on the money here. Unfortunately, with the competence displayed by OP I don't think it'll ever take off and entertain me by crashing and burning like all the other half arsed exchanges that have come and gone.

  20. Premature optimization by kasperd · · Score: 4, Insightful

    This sounds very much like premature optimization. You may end up designing a very scalable application and have the project fail due to too few users. If the actual number of users turns out to be an order of magnitude less than what you can handle on a single host, then all that scalability work was wasted. I think you have a better chance of success with a quick proof of concept, which isn't very scalable.

    It is ok to think about scalability before you have the users. But don't waste time implementing the scalable solution for a non-existing user-base.

    --

    Do you care about the security of your wireless mouse?
    1. Re:Premature optimization by Kjella · · Score: 4, Insightful

      Not to mention that scalable is also relative. If you are a smash hit and need to upgrade fast, you can get a 10G link to the backbone and an 8-socket Xeon E7-8870 server with a ton of memory and a RAID array of SSDs as a pretty damn good stop-gap, which I assume you can't afford now since you can't afford to hire developers. There's probably a bunch of other optimizations you can do too in order to offload parts to other machines when you get that far. This is like asking "Will the wind resistance of my afro keep me from breaking the world record in the 100 meter dash?" Start caring about that when you get below 10 seconds, not when you're considering a running career, and don't count getting a haircut as the first step of the way.

      --
      Live today, because you never know what tomorrow brings
    2. Re:Premature optimization by UnknownSoldier · · Score: 5, Interesting

      Agreed. This guy doesn't really understand scalability.

      The OP needs to read how Plenty of Fish started off:
      http://highscalability.com/plentyoffish-architecture

      * PlentyOfFish (POF) gets 1.2 billion page views/month, and 500,000 average unique logins per day. The peak season is January, when it will grow 30 percent.
      * POF has one single employee: the founder and CEO Markus Frind.
      * 30+ Million Hits a Day (500 - 600 pages per second).
      * 1.1 billion page views and 45 million visitors a month.
      * Has 5-10 times the click through rate of Facebook.
      * 2 load balanced web servers with 2 Quad Core Intel Xeon X5355 (2.66 GHz), 8 Gigs of RAM (using about 800 MBs), 2 hard drives, runs Windows x64 Server 2003.

      And also about NginX:
      http://www.aosabook.org/en/nginx.html

      If you "need" multiple servers when you are first _starting_ out you're probably focusing on solving the wrong problems.

    3. Re:Premature optimization by loneDreamer · · Score: 2

      It may be worth it to spend a little time thinking about peculiarities of your data that may greatly reduce scalability problems. For instance:

      1.- If your data or user base can be easily partitioned
      2.- If you can get away with low consistency semantics

      If you can find a nice architectural design that has any of these characteristics, many bottlenecks can be removed and scaling up in the future may prove easy. In those cases, there are abundant technology solutions that you could pick up in the future.
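
      As an illustration of point 1, here is a minimal sketch of partitioning users across database shards by hashing the user id. The shard names and the shard_for_user helper are hypothetical, and simple modulo hashing is assumed; it reshuffles most users when the shard count changes, which is the problem consistent hashing exists to address.

```python
import hashlib
from typing import Sequence

# Hypothetical shard names; in practice these would be connection strings or hostnames.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for_user(user_id: str, shards: Sequence[str] = SHARDS) -> str:
    """Map a user id to one shard, always the same one for the same id."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then reduce modulo the shard count.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


alice_shard = shard_for_user("alice")
used_shards = {shard_for_user("user-%d" % i) for i in range(1000)}
```

      This only works cleanly if a request rarely needs data from two users on different shards, which is exactly the "easily partitioned" property to look for.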

      So don't try to solve the whole issue before it is actually an issue, but at the same time try to future-proof your architecture. It is true that many well-known companies started small and did a full transformation later on, when they had more resources, but then again they did pay a steep cost for doing so.

    4. Re:Premature optimization by Nimey · · Score: 1

      That was in 2009. I certainly hope he's not still running Server '03, for starters.

      --
      Hail Eris, full of mischief...

      E pluribus sanguinem
    5. Re:Premature optimization by jgrahn · · Score: 1

      It may be worth it to spend a little time thinking about peculiarities of your data that may greatly reduce scalability problems. For instance:

      1.- If your data or user base can be easily partitioned
      2.- If you can get away with low consistency semantics

      If you can find a nice architectural design that has any of these characteristics, many bottlenecks can be removed and scaling up in the future may prove easy. In those cases, there are abundant technology solutions that you could pick up in the future.

      Or perhaps this idea of his doesn't have to be centralized at all. It seems to be a knee-jerk reaction today to "N users will want my FooBar idea, therefore I need one big www.foobar.com web application which handles all of them". Might be true given the FooBar idea -- or might not.

      Usenet, Git and BitTorrent are some counter-examples.

    6. Re:Premature optimization by kasperd · · Score: 1

      1. If your data or user base can be easily partitioned
      2. If you can get away with low consistency semantics

      I agree, those properties make scalability much easier. There is another possibility, which is if your data is mostly static. If you can simply copy your data to a bunch of servers and be done with it, then scalability is easy.

      The real killer is if you have strong consistency requirements, and you have users worldwide, and data cannot be partitioned since users around the world need to read and modify the same data. If all of those are present you will be constrained by the speed of light, and putting all the smartest engineers in the world on one project won't increase the speed of light. You'll end up either relaxing the consistency requirements, artificially partitioning data, or increasing end user latency.
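
      To put a rough number on the speed-of-light limit, here is a back-of-the-envelope calculation. The 20,000 km distance (about half the Earth's circumference) and the two-thirds-of-c figure for light in optical fiber are round assumptions, and real network routes are longer than the great-circle distance.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in vacuum
FIBER_FACTOR = 2 / 3  # assumption: light in optical fiber travels at roughly 2/3 of c
ANTIPODAL_DISTANCE_KM = 20_000  # assumption: about half the Earth's circumference


def min_round_trip_ms(distance_km: float, medium_factor: float = 1.0) -> float:
    """Lower bound on round-trip time in milliseconds over the given distance."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_PER_S * medium_factor)
    return 2 * one_way_seconds * 1000


vacuum_ms = min_round_trip_ms(ANTIPODAL_DISTANCE_KM)
fiber_ms = min_round_trip_ms(ANTIPODAL_DISTANCE_KM, FIBER_FACTOR)
print("vacuum: %.0f ms, fiber: %.0f ms" % (vacuum_ms, fiber_ms))
```

      That comes out to roughly 133 ms in vacuum and 200 ms in fiber for a single round trip to the far side of the planet, before any routing or processing. A strongly consistent write that needs several such round trips will be noticeably slow no matter how good the engineers are.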

      --

      Do you care about the security of your wireless mouse?
    7. Re:Premature optimization by Anonymous Coward · · Score: 0

      Varnish + Nginx + memcache + some sql db for transactional data + some nosql db for non-transactional / transient data

      Done...

      You can stick this on top of almost all platforms, and it will run nice, to a point where you will be able to get more dev / sysadmin / devops to sort the issues out...

    8. Re:Premature optimization by Anonymous Coward · · Score: 0

      You left out the most important ingredient: Akamai.

    9. Re:Premature optimization by Gr8Apes · · Score: 1

      I've seen one of those solutions in progress - the system was so screwed by bad design that the 8-proc servers couldn't handle the load of 1000 users. It wound up being a full rewrite, after which we could handle 5K users per smaller server, with multiple servers scaling out, and the original DB hardware handled more than 100K users, whereas the DB for the original system couldn't scale up fast enough. You should always make sure to practice good design, it only costs you a little time to think, which is a tiny tiny fraction of the time/cost/effort for what's needed in comparison to rewrites.

      This does not mean that you should overthink the problem and spend years on design. As I've mentioned before, there's an almost cookie-cutter approach to the server side functionality for pretty much all apps, with a few choices to accomplish your desired path of scalability.

      --
      The cesspool just got a check and balance.
    10. Re:Premature optimization by iMadeGhostzilla · · Score: 1

      And even if you do end up with 100s of 1000s of real users, the app and therefore the architecture will be different than what you're projecting in your mind now. "Premature optimization" is right on the money.

    11. Re:Premature optimization by Anonymous Coward · · Score: 0

      Besides, the drag of the afro will help you train anyway :P. Only ever cut your hair the day you run in the olympics.

  21. Start simple by joshv · · Score: 2

    Probably the worst thing you can do is start with some complex clustered architectural design.

    Just start on a single server with technologies that are scalable, and design with future scalability in mind. Also design in the ability to capture detailed performance metrics of every tier. When, and if, your application usage grows, scale the parts of it that need scaling.
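
    Capturing per-tier timings does not need much machinery to start with. A minimal sketch, assuming in-process collection; the tier names, the timed helper and the summary function are all made up for illustration, and a real setup would ship these numbers to a metrics system instead of keeping them in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, List

# tier name -> list of observed durations in seconds
timings: Dict[str, List[float]] = defaultdict(list)


@contextmanager
def timed(tier: str):
    """Record how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[tier].append(time.perf_counter() - start)


def summary() -> Dict[str, Dict[str, float]]:
    """Count, average and worst-case latency per tier, in milliseconds."""
    return {
        tier: {
            "count": len(samples),
            "avg_ms": 1000 * sum(samples) / len(samples),
            "max_ms": 1000 * max(samples),
        }
        for tier, samples in timings.items()
    }


with timed("database"):
    time.sleep(0.01)  # placeholder for a real query
with timed("rendering"):
    sum(range(10_000))  # placeholder for building the response

report = summary()
```

    With numbers like these per tier, "scale the parts that need scaling" becomes a lookup rather than a guess.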

    The biggest issue with scaling is usually the database, and for applications where you are just using the database as a simple persistence store for user settings and simple small data sets, you are probably best to go with one of the many scalable "NoSQL" type solutions such as MongoDB, as they've got scalability baked in for free. If you're trying to run heavy duty analytics that join and aggregate massive datasets, there are single DB clustering solutions, but they aren't cheap. You can always scale out SQL databases horizontally, but then you've got issues cloning and replicating, though there are a lot of products in that space, both free and commercial. A cheap place to start would be with PostgreSQL, which appears to have multiple open source replication products.

    I don't think there is anything inherently limiting to sticking with Java. It's what you know, and the toolsets are deep and rich. No, it's not the hot new thing, but sometimes that can be a good thing.

    1. Re:Start simple by Ash-Fox · · Score: 2

      The biggest issue with scaling is usually the database, and for applications where you are just using the database as a simple persistence store for user settings and simple small data sets, you are probably best to go with one of the many scalable "NoSQL" type solutions such as MongoDB, as they've got scalability baked in for free.

      Are you a troll, malicious or just plain not knowledgeable?

      --
      Change is certain; progress is not obligatory.
    2. Re:Start simple by sourcerror · · Score: 1

      But but but it's webscale!

  22. PHP by phantomfive · · Score: 2

    Facebook did it on PHP. I sure wouldn't have used that, but it shows you can do more with basic technologies than you would expect.

    The Java environment was built for that kind of thing (Spring, Hibernate, etc.), so if you build in that, you can be reasonably sure your system will be scalable.

    Keeping session state in RAM will make your life harder.

    Even with a 'slow' technology, you can always add more servers. The difficult bottleneck is the database, and that can be an intractable problem depending what your goal is.

    --
    "First they came for the slanderers and i said nothing."
    1. Re:PHP by Anonymous Coward · · Score: 1

      No one sane building web apps does it with java anymore, unless you can afford to hire twice as many people for standard production. And then you still have a pile of shite.

    2. Re:PHP by dbIII · · Score: 1

      I've had to upgrade a pile of desktop computers that were handling work related tasks with no problems but were brought to their knees once a web browser was pointed at that piece of shit Facebook was initially. They got a lot of things wrong to start with, maybe some from crap PHP, others from spitting in the face of web standards to shove more ads down people's throats. Facebook's success is due to selling the concept to advertisers and not due to their crappy initial implementation or whatever it is now.
      So maybe that's the answer - getting it running to the satisfaction of those that are going to fund it is the important first goal and easier than the end goal. Dropbox had an even worse time starting up, with utterly stupid newbie mistakes in their buggy Python middleware to Amazon storage producing a variety of entertaining security breaches, but they had managed to convince enough people that they survived. Whatever Dropbox is now, it probably doesn't have a single line of code left over from its quick and nasty demo state that was inflicted on early adopters.

    3. Re:PHP by phantomfive · · Score: 1

      I'm sure those computers wouldn't handle facebook any better now. There's a lot on those pages.

      --
      "First they came for the slanderers and i said nothing."
    4. Re:PHP by dbIII · · Score: 1

      The thing that pissed me off the most back then was the forced loading of everything on those pages every minute and a variety of tricks designed to prevent proxies taking the load (eg. pretend the page is ten years old so needs a refresh - then once you circumvent that trick they add another). Now computers are faster and bandwidth is cheaper but there's still a lot of it as antisocial to the internet as viagra spam in many ways.
      Having to pay a gouging telco monopoly an extra hundred a month just so that Facebook usage didn't clog the pipe was annoying. Also I think a lot of their balancing of the load was forcing far too much work onto the client side.

    5. Re:PHP by Anonymous Coward · · Score: 0

      Facebook did it on PHP, until they had scalability issues...

      Then hired very smart people and did HipHop (an alternate PHP executor that "basically" converts your code into native code).

      They are also adding / proposing nice additions to PHP (which i hope get adopted, like strong typed option - which will speed up all var access hugely, among others)...

    6. Re:PHP by Anonymous Coward · · Score: 0

      I'd be more angry at the Telcos than Facebook. Google can sell the people in Kansas City a 1Gbps internet connection for $70 per month, while we have to pay $105 per month for 10% that connection speed at home and $4000 per month for 10% that connection speed for a business.

  23. Re:HAHA Ha ha ha haha. by Anonymous Coward · · Score: 0

    I know! He can fire up a Kickstarter page and jump right to "Profit!" without having to do a thing. Well, depends on how fast he runs...

  24. Do it first, do it "right" later by Anubis+IV · · Score: 3, Insightful

    If you're aiming for as many users as you say, then it'll take awhile to get there and you'll have plenty of time to hire folks along the way. At that point, you can go ahead and worry about re-architecting everything. First things first though, especially if you're by yourself: get it up and running with whatever technologies you do know. Once it starts to take off, you can hire people to rewrite it and redesign it around best practices.

    It's not the simplest path, but without bringing in outside investors who'll have the capital to allow you to hire the team it sounds like you need, I don't see what choice you have.

    1. Re:Do it first, do it "right" later by TCM · · Score: 3, Insightful

      You mixed up your wording.

      Do it right the first time, optimize for speed later. You don't want to find out you're unable to optimize because the design is flawed.

      --
      Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
  25. Re:HAHA Ha ha ha haha. by crutchy · · Score: 1

    if you were really laughing that hard at tfa, you really need to get out more

  26. Algorithms matter more than languages by chrylis · · Score: 2

    While there are a number of good tools out there for working with scalability, more important than any particular tool is building your application in such a manner that it's easily parallelizable. In a Web app, a core principle to keep in mind is that the more stateful the application server-side, the more difficult it is to scale, and so designing your application tiers in such a way as to decouple requests is key. Limit the amount of session state the server has to keep track of, and you'll be able to load-balance request handling smoothly.
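
    One way to keep session state off the app servers, sketched under assumptions: put the session in a signed token that the client sends back, so any server behind the load balancer can verify it without shared RAM. The names `SECRET_KEY`, `issue_token` and `read_token` are hypothetical, and a real deployment would also want expiry and key rotation, which are left out here.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret: every app server must be configured with the same value.
SECRET_KEY = b"change-me"


def issue_token(session: dict) -> str:
    """Serialise the session and append an HMAC so any server can verify it."""
    payload = base64.urlsafe_b64encode(json.dumps(session, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def read_token(token: str):
    """Return the session dict, or None if the signature does not match."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode()))
```

    The token is signed, not encrypted, so it should only carry data the user is allowed to see (a user id, say), with everything else looked up per request.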

  27. Some research material by leonardop · · Score: 5, Insightful

    I salute you for your ambition and determination. I hope you get to realize your vision.

    Now, as I read your question, I remembered an interview I saw a few days ago with Ben Kamens, one of the engineers working at Khan Academy, talking about scalability and things like how they manage their operation and the spikes of growth they have experienced in the past. It's a little light in technical details, but you may find it interesting: Root Access: How to Scale your Startup to Millions of Users.

    One thing I'd like to mention is that when you hear someone else talk about the things they've done and how they have done it, it's easy to see it as an advertisement for a particular technology platform (AppEngine and other Google machinery in the previous video, for example), but that's not the thing to focus on. Whatever choices other people have made, the good thing is that their advice can be useful no matter what choices you end up taking. I know this seems like such a trivial thing to say, but evidence suggests that a number of people miss this basic concept, and then discussions quickly degenerate into pointless noise about concrete technologies, instead of the ideas.

    I'd also recommend that you pay a visit to Google Developers youtube channel and type something like "scale" or "scalability" in the little channel search box. You might learn a few things from some really smart people who have confronted very real situations regarding scalability.

    Best of luck to you, my friend.

    1. Re:Some research material by Anonymous Coward · · Score: 0

      THIS! No one wants to know your opinion on how the OP will fail.
      What the OP asked for was not your opinion on whether his idea would get a million users, rather, which technology he should use so that if and when it happens, he is prepared. Instead you have Slashdot giving their opinions on whether he will succeed or not in even getting a hundred users (all the while without even knowing his idea!).
      Slashdot is such a negative place to be. Possibly by 50 year olds who have tried things and failed and generally have a negative disposition to everything.
      Hacker news is so much better as they seem to be 30 year olds wanting to change the world

  28. Focus on your idea, your business!!! by Anonymous Coward · · Score: 0

    Don't focus on your millions of users; you probably will never get there
    if you start out spending a couple of years writing intricate frameworks
    for scaling and performance.

    Make an MVP, show it around, see if you can get any interest in it.
    Get your idea out there as soon as possible.

    If it takes off, if you get good growth, you should be able to raise
    money and create the ultimate implementation.

  29. one book that may help by Anonymous Coward · · Score: 0

    I recommend the book "Scalability Rules: 50 Principles for Scaling Web Sites" by Martin L. Abbott and Michael T. Fisher. It isn't specific to any platform, but I found it quite useful when I first started designing for scalability.

  30. OT: "why not" by Anonymous Coward · · Score: 1

    Just a little advice for you, take it or leave it: when offering ideas try to state them in a positive constructive manner and stay away from negative phrases that put people on the defensive, like "why not." Anecdotal: when I was young and my managers called me a prodigy programmer half my colleagues absolutely hated my guts, that is until someone pointed out to me that I always "told people 'why not do this' and 'why not do that'." So I trained myself in avoiding those phrases and presto, I became a much-liked prodigy.

    1. Re:OT: "why not" by berashith · · Score: 3, Funny

      Great advice. I see they helped you remain humble also

    2. Re:OT: "why not" by Anonymous Coward · · Score: 5, Funny

      This. I have found it's best to avoid phrases like "why not shut the fuck up," "why not eat shit and die," and "why not stick your unsolicited advice up your ass." People do not react to these phrases as positive suggestions as intended, and they immediately go on the defensive. Instead, try sarcasm.

    3. Re:OT: "why not" by Anonymous Coward · · Score: 0

      That anecdote gained nothing from you repeatedly pointing out how much of a "prodigy" you think you were.

    4. Re:OT: "why not" by Anonymous Coward · · Score: 0

      Perhaps you've misunderstood his sentence. Maybe he worked as a programmer for Prodigy until they got bought by SBC. Calling him a Prodigy programmer would then be sarcastic.

    5. Re:OT: "why not" by Hognoxious · · Score: 2

      Instead, try sarcasm.

      Brilliant idea, that's sure to work.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  31. Use ScaleEngine by Anonymous Coward · · Score: 0

    It's BSD-based, and the owner is Canadian, what could possibly go wrong?

  32. Resistance is futile. by Anonymous Coward · · Score: 0

    We are the Borg.

  33. Appscale by Instine · · Score: 1
    --
    Because you can - or because you should?
  34. make it work first by Lehk228 · · Score: 1

    make it work first, unless what you build the first time around is really an unholy mess you will be able to scale and upgrade as you grow much better than you can predict future hotspots on a system that isn't even running yet.

    --
    Snowden and Manning are heroes.
  35. Easy scaling by jbolden · · Score: 1

    You haven't said anything about the problem. If you want ease of scaling go with a pure functional language. Functional languages will force you to isolate state issues. Isolate state and you can operate in total parallel. But... generally you have shared objects which are mutable across the users. So you'd end up with very little meaningfully isolated. Those shared mutable objects are what is creating the scaling complexity. The language or technology doesn't solve that, though it can make the solution easier to implement or harder. You're going to need to architect around passing them around and verifying.

    But at least it will get you thinking about the problem the right way.
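
    A small illustration of the "isolate state" point, in Python rather than a pure functional language (that swap is mine, not the poster's). `render_profile` is a hypothetical pure function: its result depends only on its argument, so calls can be spread over workers without locks. Anything that mutates shared objects would still need the coordination described above.

```python
from concurrent.futures import ThreadPoolExecutor


def render_profile(user: dict) -> str:
    """Pure: reads only its argument and touches no shared state."""
    return f"{user['name']} ({len(user['friends'])} friends)"


def render_all(users: list) -> list:
    """Fan the pure function out over a pool; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(render_profile, users))
```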

  36. Lean Startup by Anonymous Coward · · Score: 0

    It ain't what you don't know but what you know that ain't so. Lean Startup may help with how to scale based on actual user needs and desires as well as ways to help figure out what those really are for paying customers.

  37. Try the CQRS design pattern by Lairdsville · · Score: 2

    I suggest you look at the CQRS pattern. A good Java implementation is http://www.axonframework.org/. The advantage of the CQRS pattern is that it is fairly simple but highly scalable. So you can start small and simple with the confidence that you can tweak and optimise in the future to scale as required. There are good tutorials and support too. My team is using it for an industrial application and we have found that it has been very robust. It might take a bit of work to get your head around the concepts, but it is worthwhile in the end.
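
    For readers new to the idea, a minimal sketch of the split CQRS asks for: commands go through one model that validates and records changes, queries go to a separate model shaped for reading. This is not Axon's API; `PostCreated`, `WriteSide` and `ReadSide` are hypothetical names, and using events to keep the read model in sync is one common choice assumed here, not a requirement of the pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostCreated:
    post_id: int
    author: str
    text: str


class WriteSide:
    """Command side: validates input, records the change, notifies readers."""

    def __init__(self, publish):
        self.events = []          # append-only record of accepted changes
        self._publish = publish   # callback that feeds the read model

    def create_post(self, post_id: int, author: str, text: str) -> None:
        if not text.strip():
            raise ValueError("empty post")
        event = PostCreated(post_id, author, text)
        self.events.append(event)
        self._publish(event)


class ReadSide:
    """Query side: a denormalised view that is cheap to read."""

    def __init__(self):
        self._posts_by_author = {}

    def apply(self, event: PostCreated) -> None:
        self._posts_by_author.setdefault(event.author, []).append(event.text)

    def posts_for(self, author: str) -> list:
        return list(self._posts_by_author.get(author, []))
```

    Because the two sides only meet at the event callback, the read model can later move to its own store or its own servers without touching the command code.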

    1. Re:Try the CQRS design pattern by Ice+Station+Zebra · · Score: 1

      If their framework is so good, why does their web site run Apache and PHP?

  38. Wt by paugq · · Score: 1

    I like C++, therefore I use Wt for webapps. Great performance and scalability, great for embedded systems, great for huge systems, great when using third-party libraries (you can use any C or C++ library), etc.

    1. Re:Wt by dave87656 · · Score: 1

      Interesting. I was wondering if someone else would recommend C++. I've been using it for a small app because of available libraries and scalability. If I had to write a web-based app and I had to be sure it was fast, I would use C++ and possibly FastCGI.

  39. Re:Java is SLOW by Anonymous Coward · · Score: 0

    You have no idea what you are talking about: http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=java&lang2=php

  40. Idea for web app - idea for business model? by Anonymous Coward · · Score: 0

    Just wondering, do you also have an idea for a business model? Is there anything that prevents others from doing just the same?
    I mean, having an idea for a web app is great, tinkering with scalable architecture is fun, but do you have a plan how to actually make money?

  41. Danger, Will...danger...son? by Anonymous Coward · · Score: 0

    You're using too many analogies, parables, etc..., and my marketing meter is going off.

    Please state your hiring thread as such.

  42. Re:Java is SLOW by Anonymous Coward · · Score: 0

    Java is slow*, don't go that route if you are planning a large userbase. There is a reason huge traffic sites don't use it. Facebook, Yahoo, and Wikipedia use PHP, Google/YouTube use Python (and straight C).

    Stop right there. This is just wrong on so many levels.
    And besides, the programming language does not matter at all for scalability. Architecture does.

  43. Azure by akb · · Score: 1, Interesting

    Sounds like you want a PaaS provider that doesn't lock you in to a platform. I have a similar problem to you (PHP not Java) and I rejected AppEngine for the same reason as you. To my surprise I am leaning towards Azure, Microsoft's cloud offering. Their website service allows you to write your web app in a few different frameworks without having to customize it for their platform and then only pay for what resources you use. Management is as simple as manipulating sliders to how many resources you are willing to devote to your app and are willing to pay for.

    I have no interest in configuring VMs, configuring memcached, handling load balancing etc. My needs are simple, very basic PHP and Mysql. Traffic will probably start small but hopefully will spike big, but maybe it won't. Azure lets me handle this situation with a minimum of effort and expense. If they raise their prices or start to suck I can easily move my app since its simple PHP.

  44. Re:Java is not slow for webscale by Billly+Gates · · Score: 1

    In a web app the bottleneck is the I/O and latency speed of your SQL Database. Not the execution speed hogging the CPU.

    If this were an issue then how did PHP become so popular? Unlike Java, it is fully interpreted, with the exception of some mods running in the web server engine.

    For this you need a platform that has tons of mods and APIs to build upon, frameworks, great interconnectivity to RDBMS and NoSQL engines, and awesome threading support. While webapps are not CPU bound compared to graphics rendering, they are highly threaded: you can have well into the thousands of different threads and processes, all using tiny bursts of CPU activity at different times.

    Thus, Linux is insanely popular for this reason, as you can move things to other servers easily, like hits in a cluster or switch, and Java is an excellent enterprise language to do something big and expandable for that reason. Maybe a little much for something small, but it works for something big. If you know anything about high traffic you would know your performance is not about the CPU at 100% but rather the load on the machine.

  45. Re:Java is SLOW by Anonymous Coward · · Score: 0

    This is wrong wrong wrong.

    Google uses Java. Apple uses Java. Amazon uses Java. Yahoo uses Java.

    You have absolutely no idea what you're talking about.

  46. do everything very simply by roman_mir · · Score: 1

    Don't overdo anything, don't overuse any frameworks.

    Actually if your idea is good and takes off and attracts investors, you'll be able to change your technology as you go, that's what everybody goes through anyway.

    Do a simple setup, as simple as possible. Don't try to figure out how to parametrise everything and create nice administration interfaces; actually hardcode a bunch of stuff, because that's the fastest way to do something and you will have a LOT of stuff to do if you are starting from scratch.

    As you go you will be able to replace components, just ensure that you actually have components. Ensure that you have layers and components by standardising your approach. Common tasks go into common layers, components are boxes, that are fit together with APIs.

    Start by making it as simple as possible and that will also allow you to keep it relatively fast. Keep as much data as possible in memory so that your database executions are minimised.

    Without knowing anything specific about your idea, that's the general advice that can be given, there isn't any information on whether transactions are important or not, whether it's supposed to serve huge amounts of static data or whether it allows users to communicate with each other in real time or whatever, so if you want better answers you should ask more detailed questions.

  47. Re:HAHA Ha ha ha haha. by Frosty+Piss · · Score: 0, Troll

    My friend, you are 18 and in junior college, right? It shows.

    --
    If you want news from today, you have to come back tomorrow.
  48. Scaling by LordThyGod · · Score: 1

    Any of the cloud providers are great for this. You can start with a free micro image from Amazon during the development phase if you have to start dirt cheap, and go up from there. Any of the cloud providers will let you scale as far as you need. That part is a no-brainer. "Thousands of users" is a little vague. It depends totally on how many of them are active at the same time and how intensive what they are doing is. I would think something like a small 1 gig image might handle this in the low-end scenario (not everybody interacting simultaneously). That does not sound scary. Whatever development stack you are most comfortable with should work. I don't see why Java would be a bad choice. It's probably not the first choice of many.

  49. Re:Java is Awesome by codepunk · · Score: 0

    Java is awesome for contractors and systems engineers since it takes a virtual ass load of machines to run it. Job Security!

    --
    Got Code?
  50. Re:Java is SLOW by the+eric+conspiracy · · Score: 3, Informative

    Many sites with very very large userbases use Java extensively in their stack. Including eBay, PayPal, Amazon, Tumblr, LinkedIn and Google.

    Millions of page views a day is a small to medium e-commerce site. I was doing a million with Perl back in 2002 on a two CPU 1U machine.

    Tumblr gets something close to a billion, as does anyone in the top 100.

  51. Touch the ground right now: 3 things by Anonymous Coward · · Score: 0

    1. Such things are what the clouds are for. If you cannot scale it at least initially using clouds (for the parts that scale beyond reach) at per-user cost significantly below per-user income it's not a brilliant idea at all. Might still be workable but it is not brilliant in the way you think.

    2. Do you have a working implementation? If not it might be brilliant but it's just an idea. Most ideas, brilliant or not, stop being interesting once they meet the realities of implementation, even the very best ones. This is both the beauty of dirty hacks and why code usually only barely works.

    3. If 1 and 2 leave you dismayed then look at what you have and prioritize. This might mean that you prioritize the idea away completely and do something else that does fulfil 1 and 2.

  52. Real hardware by Anonymous Coward · · Score: 0

    Unfortunately you can't _really_ know how much demand "thousands" of users will impose on a real set of hardware until you've at least partially implemented it.

    Make sure your design is sufficiently decentralized (peer nodes or a master/slave design) and that the nodes are able to communicate across physical nodes (whether by normal network communication, MPI, a common network-accessible database, network filesystem, etc). Then, at least when you approach your limits, you have the option of throwing more hardware at it for a while before you have to worry about major rearchitecting of the software.

  53. Counting bytes by Anonymous Coward · · Score: 0

    I work for a big international company and the key factor we discovered to achieve high performance in Java is reducing the total number of bytes allocated and garbage collected per request. As long as that number is south of 2 MB, you can manage to cater for 1000+ concurrent users per server instance.
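
    To see why the per-request figure matters, here is the arithmetic as a tiny helper. The 2 MB and 1000 users come from the comment above; the assumption that each user sends one request every 5 seconds is mine, picked only to make the numbers concrete.

```python
def allocation_rate_mb_per_s(concurrent_users: int,
                             seconds_between_requests: float,
                             mb_per_request: float) -> float:
    """Memory the garbage collector must reclaim each second, in MB."""
    requests_per_second = concurrent_users / seconds_between_requests
    return requests_per_second * mb_per_request


# 1000 users, one request every 5 s (assumed), 2 MB allocated per request:
# 200 requests/s * 2 MB = 400 MB/s of garbage for the collector to keep up with.
rate = allocation_rate_mb_per_s(1000, 5, 2)
```

    Halving the bytes allocated per request halves that rate, which is the lever the poster is describing.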

  54. kiss by crutchy · · Score: 4, Insightful

    keep it simple stupid

    the more complex you make the app, the bigger the load on your infrastructure and bandwidth

    if you follow google's lead, they developed everything in house. same with pixar, which develops software to handle very high end graphics performance, and even linux started off by taking a problem and solving it with a home grown solution

    if you want a specialized application to handle that many users without running into software performance issues (nevermind server infrastructure and bandwidth, which can probably be gradually improved), you want to make it efficient... so you will probably need to develop it yourself

    if you use off the shelf packages like wordpress and the like, they are full of all sorts of features that you might not need but will still pay for performance-wise

    many people will try to tell you that there is no point reinventing the wheel and that existing wheels will always be better than anything you can come up with, but they are full of shit. if everyone stuck with that ideal we would all have wooden wheels on our cars. there is a lot of merit in reinventing wheels, not only to make better wheels, but in understanding wheels to learn how to better use them. be a little selective about where you want to start customizing from... i wouldn't recommend reinventing the operating system, although google did (based on the linux kernel) and they are reaping the rewards of a more efficient search platform than might otherwise have been possible.

    if you're handy with microcontroller programming you might be able to make a pretty efficient microcontroller-based server cluster, sort of similar to what HP is doing with their new SOC blade technology. microcontrollers and SOC are the future, so if you want to get involved in future tech today, pay attention to what is going on with ucs... a simple example is sheevaplugs and their derivatives. this is also where linux probably has a major leg up on windows because microsoft has been so focused on the x86 platform that (even with the recent release of Windows RT) they are lagging a ways behind linux in multi-architecture support (have to wonder how much of the linux kernel has been plagiarized in WinRT).

    other things that affect scalability and performance include the efficiency of algorithms... if you haven't done a CS degree, go onto youtube and watch lectures on data structures and algorithm optimization. there are free CS lecture series from MIT and UNSW that I know of. Richard Buckland of UNSW also makes the lectures a little less boring with his antics.

    how you develop your app will also depend on your goal to get 100,000+ users on the site...

    security is probably the hardest and most significant hurdle you'll face... if you fuck security up (either the app isn't secure enough or it's a pain in the ass for users to authenticate) then your app will be a flop

    you also need to think like a user, not like a developer... this is probably where having a small team will help at some point (a few eyes with different perspectives)

    many developers fall into the trap of developing software that is easy for the programmer and thinking that the user will get used to it... which is fine if you have a monopoly. unfortunately by the time you have 10,000 users, your idea will be copied to create competition, and if they do a better job with the user experience you're dead in the water.

    make sure you are standards compliant. use the HTML 5 and CSS 3 validators, but i would recommend avoiding features that aren't also in HTML 4.01 and CSS 2.1 until HTML 5 and CSS 3 become fully implemented and debugged. the exception would be that if you want a feature that would otherwise require flash or java, use html5 instead of flash. if you want 100,000+ users, don't use flash or java!

    i would use a linux distro such as debian with all the fat trimmed. it should be obvious, but don't use a WISA stack.

    keep your service clear of advertising, 3rd party cookies and any 1x1 hidden iframes. don

    1. Re:kiss by Anonymous Coward · · Score: 0

      3rd party stuff doesn't impact much on your infrastructure, but on the usability / user browser.

      Besides, some may be required to keep the bills paid...

  55. Consider using an external indexing engine by juliohm · · Score: 2

    Regardless of which language or platform you use, a common bottleneck for web applications is the database resource. Most developers don't take large scalability into consideration when building the service architecture. If you plan to scale large in the future, I recommend you stop thinking of the database as the main source for all queries in your system. The basic idea is that costly and complex queries/searches can be given to an external scalable service. Take, for instance, the Solr project (http://lucene.apache.org/solr/), which is a third party indexing tool that can be easily integrated with any other platform. You can design your system's database with the basic table relationships with primary keys, foreign keys and the occasional index. Any more complex table relationships, queries and searches can be delegated to this external indexing service. It will index whatever data you give it, in whatever manner you need, and return a list of results for you to easily find primary keys for direct access to your system objects. Think of it as your own personal Google indexing service... Solr is an Apache open source implementation. Once you understand this concept, you can keep your application's internal database very lean and simple, with just enough indexes and primary keys to get instant access to entities.
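
    The division of labour described here can be sketched with a toy in-memory index standing in for the external service. This is not Solr's API; `TinyIndex` is a hypothetical stand-in that only shows the shape of the idea: the index answers the search and returns primary keys, and the database is then hit with cheap key lookups.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: word -> set of primary keys containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, primary_key: int, text: str) -> None:
        for word in text.lower().split():
            self._postings[word].add(primary_key)

    def search(self, query: str) -> list:
        """Return the sorted primary keys of rows containing every query word."""
        words = query.lower().split()
        if not words:
            return []
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return sorted(result)


# Stand-in for the lean application database: primary key -> row.
rows = {1: "Scalable web apps", 2: "Web design basics"}
index = TinyIndex()
for key, title in rows.items():
    index.add(key, title)

# The search runs in the index; the database only does key lookups.
matches = [rows[key] for key in index.search("scalable web")]
```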

    --
    Julio Henrique Morimoto juliohm@gmail.com
  56. GAE - not all bad by sootman · · Score: 1

    Google App Engine apps can be written in Python 2.5, 2.7, Java, or Go. If you ever want to move it to something else, I think you can just change the way it communicates to the new database -- the rest should be pretty portable.

    --
    Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
    1. Re:GAE - not all bad by DuckDodgers · · Score: 1

      AppScale is an attempt at an open source re-implementation of Google App Engine. So if you don't like something Google's doing, you have an option besides completely porting your application to another infrastructure.

  57. One suggestion... by djbckr · · Score: 1

    You have a lot of good comments so far, but none particularly directed to your specific question. I have recently come to a framework that I *really* like, coming from a similar background to yours.

    node.js (server, business logic)
    nginx (web server, proxy to node for business logic)
    postgres (For relational/transactional data. There's a nice node.js driver for postgres)
    mongodb (For larger datasets that don't need the transactional stability or quite so structured data)
    Angular/Bootstrap with some jquery for good measure on the front-end

    This is a lean-mean server tech that blows away any Java (JEE) framework. It took me a bit to get up to speed on node and the way you program with it, but I love it now. It scales so much more on fewer machines than JEE can even dream of. For what it's worth, you should look into it.

    1. Re:One suggestion... by Ash-Fox · · Score: 1

      Regarding your recommendation of mongodb, I am interested in hearing how you counter the arguments in this article and why the other solutions are worse.

      --
      Change is certain; progress is not obligatory.
  58. The Odd One Out by Anonymous Coward · · Score: 0

    And what would one who programs do with an idea with no friends who are interested in programming?

  59. Don't worry about scaling at first by Anonymous Coward · · Score: 0

    Like lots of others have already told you: build it first and worry about scaling later. Having said that, I think the Udacity lecture series Growing Reddit is pretty interesting, but don't let it distract you from getting something up quick. http://www.youtube.com/playlist?list=PL7761FCF889E7D36D

  60. Based on experience by blkmajik · · Score: 2

    Based on my experience at a Fortune 100 company with a heavy interest in Java: don't use Java. Use PHP or Lua as a CGI. Your sysadmins who have to keep your application up will thank you.

    • Do not use Java. To make it work right you have to go against everything the community says you should do.
    • Do not use NFS
    • For file storage use something like MogileFS. It is not likely the best, but it's a proper example of what you will want
    • If you use a database you MUST understand and use the relational aspects of things. If you use the database as just a key:value store I will personally beat the ever living shit out of you.
    • Use loose coupling and sharding of your data. Multiple databases on multiple replicated servers is happy. Isolate each aspect of your product into separate databases.
    • On a related note it's likely that only accounts need to be replicated between databases. It's not hard and will allow you to scale very large
    • If you use memcached do not store individual bits of data. Store complete rendered data only
    • Use CGI. Mod_* and Java are a bitch to debug. PHP and Lua work well for this. If you have something that is multi-tenant multi-version this applies even more.
    • Do not use web session affinity.
    • Do not use full text search in a database
    • Do not use stored procedures
    • Do not use large frameworks. If you must use a framework use small ones dedicated to a small subset of functionality. No framework you use should use a database.
    1. Re:Based on experience by Anonymous Coward · · Score: 0

      for the most part do not store files in database records just store the path/link - filesystems are surprisingly good for files.

    2. Re:Based on experience by bored · · Score: 3, Insightful

      If you use a database you MUST understand and use the relational aspects of things. If you use the database as just a key:value store I will personally beat the ever living shit out of you.

      Like all simple rigorous rules, this is sort of bad advice in a lot of circumstances. Sure, inventing your own hashing function and using the hashes as the keys in a relational DB is stupid. That said, focusing on the main relationships with your tables, and not trying to describe every single edge case, will massively simplify the schema. Plus, there are tons of little pieces of information that often need to be persisted that just don't tend to have any kind of obvious relationship to anything else in the schema. Being able to add key:value attributes on the fly in the code without screwing with the schema can be a huge bonus to initial productivity. Sure, if at some point you discover common, frequently used attributes, or you have some kind of performance issue because you're reading some value out of a key:value store frequently, then by all means fix it.
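
      One way this middle ground can look, sketched with SQLite from the Python standard library: the main relationship gets a proper table, and odd one-off attributes go into a side table keyed by user and attribute name. The table and function names (`users`, `user_attrs`, `set_attr`, `get_attr`) are hypothetical.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # The main entity gets a real relational table.
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Miscellaneous attributes live in a side table instead of new columns.
    db.execute(
        "CREATE TABLE user_attrs ("
        " user_id INTEGER NOT NULL REFERENCES users(id),"
        " attr_name TEXT NOT NULL,"
        " attr_value TEXT,"
        " PRIMARY KEY (user_id, attr_name))"
    )
    return db


def set_attr(db, user_id: int, name: str, value: str) -> None:
    """Insert or overwrite one attribute without touching the schema."""
    db.execute(
        "INSERT OR REPLACE INTO user_attrs (user_id, attr_name, attr_value)"
        " VALUES (?, ?, ?)",
        (user_id, name, value),
    )


def get_attr(db, user_id: int, name: str, default=None):
    row = db.execute(
        "SELECT attr_value FROM user_attrs WHERE user_id = ? AND attr_name = ?",
        (user_id, name),
    ).fetchone()
    return row[0] if row else default
```

      If an attribute turns out to be read on every request, promoting it to a real column on `users` is the fix the comment has in mind.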

      All that said, I'm not really a fan of trying to eke performance out of a database. Use the database for what it's good at: complex relationships, and easy storage/retrieval of information. But if your app is trying to do 500k updates per second to a single table, it's probably a better idea to seek alternatives rather than throw a bunch of money at database hardware. I have my own mental rule: is this code path going to be a hot one? Yes? Then no database queries. There are a ton of strategies for moving the queries/updates out of the paths that are performance sensitive.

  61. Re:Java is SLOW by Frankie70 · · Score: 1

    Yes, one should choose PHP, Ruby, Perl and Python over Java because Java is not compiled code. I think you are a moron. PHP, Ruby, Perl and Python aren't compiled code either.

  62. API first by foniksonik · · Score: 3, Informative

    Write your public and private APIs first. Then implement them quick and dirty. Get feedback. Get users. Keep working on the API to make improvements. As you get more traffic, hire good people to reimplement those same APIs on a better tech stack. Rinse and repeat. You can even mix and match platforms; just use a smart routing proxy like HAProxy to send requests to the appropriate places. Static files go to a CDN, logins can go to something small but secure, and high volume requests can go to a big cluster or IaaS like Amazon or Google for on-demand scaling.

    API first.

    --
    A fool throws a stone into a well and a thousand sages can not remove it.
    1. Re:API first by Anonymous Coward · · Score: 0

      Yes.

      If 30 years of software development has taught me anything, it's to define your interfaces first, worry about implementation after.

    2. Re:API first by c0lo · · Score: 3, Insightful

      Write your public and private APIs first. Then implement them quick and dirty....

      API first.

      So true, it can't be stressed enough. Supplementary:

      1. when considering APIs, consider them in terms of service interfaces: even better if these services are stateless.

      2. implement the services as different processes, exchanging data in whatever serialization format you fancy (Java serialization, JSON, Google's protocol buffers). Use the quick-and-dirty for their first cycles of implementation: as long as you maintain the interfaces unchanged, one can later come and re-implement them better.

      3. pay attention to what needs to be shared across the whole system and what can be divided/partitioned on different hosts.
      E.g. it is highly probable that "subscription info/user identity/login services" will need to be supported by a single "database" but, once the user finishes the login, she gets her data from storage hosted elsewhere, supported by whatever later development cycles find appropriate. (Of course, at later stages, one will need to implement a "registry" mapping a user identity to where the data is stored. But the first implementation can use a single database for the data of all users, as long as you do not tie the login service to other services.)
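The "registry" idea in point 3 could be sketched roughly as follows. All names here (`UserDataRegistry`, `SingleHostRegistry`, `MappedRegistry`) are hypothetical, and host names are plain strings chosen for illustration. The point is only that callers ask the registry where a user's data lives, so the single-database first version can later be swapped for a partitioned one without touching them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: answers "which storage host holds this user's data?"
interface UserDataRegistry {
    String storageHostFor(String userId);
}

// First implementation: the data of all users lives in one database.
class SingleHostRegistry implements UserDataRegistry {
    private final String host;

    SingleHostRegistry(String host) {
        this.host = host;
    }

    public String storageHostFor(String userId) {
        return host;
    }
}

// Later implementation: explicit per-user assignments, with a default
// host for users that have not been moved anywhere yet.
class MappedRegistry implements UserDataRegistry {
    private final Map<String, String> assignments = new ConcurrentHashMap<>();
    private final String defaultHost;

    MappedRegistry(String defaultHost) {
        this.defaultHost = defaultHost;
    }

    void assign(String userId, String host) {
        assignments.put(userId, host);
    }

    public String storageHostFor(String userId) {
        return assignments.getOrDefault(userId, defaultHost);
    }
}
```

In a real deployment the assignments would themselves need to be stored somewhere durable and shared; the in-memory map is only there to keep the sketch self-contained.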

      --
      Questions raise, answers kill. Raise questions to stay alive.
  63. Re:Java is SLOW by tgetzoya · · Score: 1

    You are very wrong.

    Compared to PHP, Java is much faster when it comes to business logic. A combination of PHP for user-facing material and Java for business logic and database interactivity is the best of both worlds solution.

    Facebook uses HipHop (http://en.wikipedia.org/wiki/HipHop_for_PHP) to convert PHP to C++ to gain some speed advantages. Also, they use a database back-end called HBase (http://hbase.apache.org/) which is written in Java.

    Your 1200 users are not the hundreds of thousands the poster is looking at/for. For this, I recommend Java with Cassandra or HBase in the back end. Cassandra (http://cassandra.apache.org/) and HBase are both NoSQL and are limited only by disk space availability. Postgres/MySQL/*SQL can also be used when relational information is mandatory. At work we did some research and found that for large data sets, NoSQL write speeds were an order of magnitude faster than Postgres/*SQL. Those writes included encrypting the data before writing to the database. Read speeds were on par with each other.

  64. my advice: by buddyglass · · Score: 0

    Swallow your pride and go with App Engine. Here's my thinking:

    1. GAE eliminates a lot of otherwise time-consuming setup work. You're just one guy. If you want to ever get this thing finished you need to spend your time coding and not setting up servers, mucking around with database settings, etc.
    2. GAE is scalable w/ little to no extra effort on your part, assuming you code your app correctly.
    3. GAE solves the problem of serving a user base that's geographically distributed all over the world.
    4. GAE lets you keep coding in Java.
    5. GAE is pay-as-you-go. If nobody visits your site you're not out a lot of money.

  65. Node.JS + MongoDB + Geolocation DNS by corychristison · · Score: 0

    Without any actual information on the project, this is my recommendation... MongoDB is designed for clustering and replication of various types of Data.

    Node.JS scales fairly well and is pretty light weight.

    With Geolocation DNS you can start small in your local area, then add servers in places you need to.

    1. Re:Node.JS + MongoDB + Geolocation DNS by Anonymous Coward · · Score: 0

      lol node.js.

    2. Re:Node.JS + MongoDB + Geolocation DNS by Ash-Fox · · Score: 1

      MongoDB is designed for clustering and replication of various types of Data.

      I am interested in hearing how you counter the arguments in this article please.

      --
      Change is certain; progress is not obligatory.
    3. Re:Node.JS + MongoDB + Geolocation DNS by corychristison · · Score: 1

      I was not aware of these issues. Thank you for pointing this out to me.

      My experience with MongoDB hasn't been much outside of a fairly small personal multimedia indexing database. In reality I could probably have gotten away with a flatfile database but I like the OO nature of MongoDB.

  66. Nope... call Tata, and focus on your business by Anonymous Coward · · Score: 2, Funny

    One company I worked for (until I found a sweeter place elsewhere) got rid of their entire dev staff except for the top level designers. An offshore dev team gives guarenteed results, low bugs per line count, and actual contracts to say that. As an added bonus, the parking garage doesn't smell like BC bud anymore.

    You might give them, or another offshore place, a call. They may be able to get what needs done, with little QA, for pennies on the dollar compared with what it costs to hire people locally.

    1. Re:Nope... call Tata, and focus on your business by meustrus · · Score: 1

      Outsourcing is a terrible idea. I assume parent is being sarcastic about "guarenteed [sic] results, low bugs per line count". East Asia outsourced software developers are famous for producing buggy code, or the wrong code, past deadlines.

      --
      I sometimes ask revealing, often ignorant-seeming questions. Maybe they're harder to answer than you think.
    2. Re:Nope... call Tata, and focus on your business by Malenx · · Score: 1

      So they off-shored from India to America?

    3. Re: Nope... call Tata, and focus on your business by Anonymous Coward · · Score: 0

      I half wonder if you're talking about where I work.. Does it start with a T?

  67. Modularize the whole thing by i.r.id10t · · Score: 1

    Break your whole project down into little tiny modules, with a configuration file to provide various host names when you start breaking things up onto different hosts. And, where possible, use wrapper functions for things like DB calls.

    This way when you move from mysql to postgres to oracle to NextBigDBPlatform you change the one wrapper function, not every part of your code. When you see that Java isn't the best tool for a particular job, re-write the small module in charge of that job in some other language. As long as it gives the same output with the same input, who cares?

    Basically, build yourself up an API. Let the individual building blocks pass info back and forth thru either other API calls or platform and language neutral, well established communication protocols like HTTP puts/gets/posts using curl or whatever method you like to use to generate and then deal with the output of such requests.
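The "configuration file to provide various host names" part might look something like this in Java. This is a minimal sketch: `ServiceHosts` and the `<module>.host` key convention are assumptions made for illustration, and the config text is passed in as a string so the example stays self-contained (in practice it would be read from a file).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical wrapper: modules ask it for a host name instead of hard-coding one.
class ServiceHosts {
    private final Properties props = new Properties();

    // configText uses java.util.Properties syntax, e.g. "db.host=db1.example.internal".
    ServiceHosts(String configText) {
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Falls back to localhost, which is right while everything still runs on one box.
    String hostFor(String module) {
        return props.getProperty(module + ".host", "localhost");
    }
}
```

Moving a module to its own machine then means adding one line to the config rather than editing every caller.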

    --
    Don't blame me, I voted for Kodos
  68. Red Hat's OpenShift by James+Manning · · Score: 1

    IMHO, rejecting Google App Engine at this stage is a bit myopic, but it's your choice. :)

    Among the more open-standards-focused cloud offerings, there is Red Hat's OpenShift

    - https://www.openshift.com/
    - https://www.openshift.com/developers/java
    - https://www.openshift.com/developers/pricing
    - http://www.jboss.org/openshift.html

  69. Java? *twitch* by Anonymous Coward · · Score: 0

    I'm in the middle of porting an Atlassian product.

    They are probably some of the best Java apps out there, and this port is a NIGHTMARE.

    The easiest ports are PHP.

    Start with something that you can write well. If you are good at Java, then use Java. If you are good at PHP, then use PHP. If your seed project sucks, then forget about scaling. It won't be able to leave the crib.

    I use PHP, not because it's a good language (it isn't), but because it is probably the single most portable host-executed dynamic language out there. It is the "P" in LAMP.

    If I want my system to be portable, extensible, scalable and installable by as many folks as possible, then I use PHP. If I have a completely bespoke server, that I'll never move from, then I have a lot more choices. It is pretty difficult to get decent Java hosting, BTW...

    PHP is pretty fast, these days. There are a number of ways that bad coding can slow it down (for example, using it to build vast MVC frameworks as if it were Java), but a great deal of Facebook is written in PHP, and they have chosen to make the language faster, as opposed to rewriting their code base. They have done something similar with MySQL.

    1. Re:Java? *twitch* by Anonymous Coward · · Score: 0

      I haven't tried them, but Heroku, Cloudbees, Red Hat OpenShift, and Google App Engine all do Java app hosting.

  70. Re:Java is SLOW by Anonymous Coward · · Score: 0

    Oh, my god.

  71. AWS Elastic Beanstalk by mejmt · · Score: 1

    Amazon AWS's Elastic Beanstalk service is perfectly suited to this sort of problem. If you do your homework and design your system properly, it can automatically scale from a single box up to a giant group of servers capable of handling as many users as you can muster. It's pretty magical. We use it for everything we build, just in case.

  72. Re:HAHA Ha ha ha haha. by VortexCortex · · Score: 1

    No, I'm 30-ish and have worked for companies where doing such things is fucking part of business. Go to Linode.com. DONE. Hell, anywhere can solve this problem of hundreds of thousands. It's called fucking load balancing. Get a few Cassandra nodes running. USE FUCKING GOOGLE to search for the answer. Look into how OTHER COMPANIES pull it off... Coming to slashdot? For fucking serious? Yeah, I laughed. They fucking made me do it.

  73. GAE by Anonymous Coward · · Score: 0

    I know you stated you weren't interested in GAE but this is exactly the scenario it excels at. You have a large globally distributed infrastructure that can scale massively at your disposal. And as long as you stick to the high level APIs like JPA instead of coding directly to the datastore API, you have an app that you can pull out and host elsewhere if you want.

    Now another thing to consider - you said you have experience with GWT, but the trend I've noticed with Google is they're not building much on GWT at all, and I'm not sure what the long term plans for that project are. But I've spent a lot of time battling weird bugs / misconceptions when working with it, so I'd be inclined to use something more mainstream like Ember.js or other solutions. But if you're good with GWT, it's hard to beat the GWT + GAE integration.

  74. In Other Words... by Anonymous Coward · · Score: 0

    Pick up a copy of "How to Build a Porn Site" by Tim O'Reilly, since that is obviously what you are trying to do.

  75. Yes! Think of scaling now by Anonymous Coward · · Score: 0

    I see a lot of posts around the topic of "Just use PHP/Rails/etc now, worry about scale later." I disagree: it doesn't take THAT much more effort to think about scaling up front and develop the correct solution now, so that you don't have one (or multiple) panic re-orgs. The key is getting something that scales from small to big somewhat effortlessly. Nothing will scale perfectly. So here's my recommendation:

    1. Use RackSpace - I've seen this mentioned before. And I agree, RackSpace or some other OpenStack provider will ease your IT pains as you grow, plus you can start small.

    2. Use Nginx - It is as easy to set up as Apache (some say simpler) and can connect to most anything Apache can. A micro-sized VM from Rackspace (256MB RAM) is more than enough if this is the only thing running -- and it should be. I actually prefer the OpenResty build of Nginx; the extra modules allow for some nice improvements over the vanilla system. For example, off-loading your session management/timeout/login to the webserver so that the backend doesn't have to do it.

    3. Use Java - Two reasons. First, use what you know. You are going to be the only programmer to start, so it'll be the fastest way to get to market -- VERY important with your new venture. Second, while the fancy "Java Extensions" like Grails, Scala, etc. have some nice features, it's just syntactic sugar in the end. Why add a layer of indirection? Also, there are more Java developers than Scala developers, so your talent pool is larger. If you want a framework to help you get going quickly, use Apache Camel. It's an excellent framework for handling traffic from all sorts of places; it supports not only HTTP/REST, but also SMS, message queues, email systems, IRC, and a lot more.

    4. Use Static HTML, JSON, and AJAX. Template languages are soooooo 2008. To get that web 2.0 appeal (and scalability) you cannot have template engines chugging through producing HTML on the server. Use your Nginx to serve static pages, call your Java servers to get the raw data (in JSON), and use a client framework like jQuery to produce the output in the browser. If you are going to scale to tens-of-millions of users, you are going to need to push as much work to the client as possible. If you MUST use a template language, use a lightweight one like Handlebars or Velocity.

    5. Use RIAK - I'm making an assumption that your social application is going to need the noSQL/document-oriented type database over the traditional SQL DB, if only for the sheer amount of data you will be storing. RIAK is simple to set up and simple to use in a small environment (1-3 servers), but has massive horizontal and vertical scaling options. We routinely push millions of records a day into it, and use its built-in map/reduce functions to process the data as we need it. It's a bit memory hungry (I recommend a 4GB RAM VM at a minimum), but it will grow with you without having to re-engineer core parts of the system.

    So, in comparison to the LAMP framework of the past, I offer the ONRAC framework (OpenStack, Nginx, RIAK, Apache Camel). Not as catchy of a name, but exactly what you need to scale from a few 1000 users to tens-of-millions.
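To illustrate point 4 (static pages from Nginx, raw JSON from the Java side), here is a minimal sketch using the JDK's built-in `com.sun.net.httpserver.HttpServer` instead of Apache Camel, purely to keep it dependency-free. The `/api/profile` path, the `JsonApi` class and the hard-coded payload are all made up for illustration; a real service would use a proper JSON library rather than the hand-rolled escaping shown here.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class JsonApi {
    // Builds the raw data that a browser-side script would turn into HTML.
    static String profileJson(String name, int followers) {
        return "{\"name\":\"" + escape(name) + "\",\"followers\":" + followers + "}";
    }

    // Escapes only backslashes and double quotes; control characters are
    // NOT handled, which is one reason to use a JSON library in practice.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Starts a server on the given port (0 picks any free port).
    // The caller is responsible for calling stop() on the returned server.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/api/profile", exchange -> {
            byte[] body = profileJson("alice", 42).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

The server does no HTML rendering at all: the static page fetches this endpoint with an AJAX call and builds the markup client-side.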

  76. Abstraction by Anonymous Coward · · Score: 0

    For your initial launch, write it simply so you can get it done. But plan high level abstraction into the design.

    For example, you could write your app with functions like QueryDatabase(sql_command), but that ties you to specific implementations. If you've instead got a function like database.GetPostsWithKeyword(keyword), then you can write your easy MySQL implementation now, and swap it later with Dynamo or whatever else the cool kids are using later, once you've got enough traffic to warrant that.

    Do high-level abstraction like this for enough parts of your app, and you can scale it one piece at a time when necessary.
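In Java, the `database.GetPostsWithKeyword(keyword)` idea might be sketched like this. `PostStore` and `InMemoryPostStore` are hypothetical names, and the in-memory class merely stands in for the "easy MySQL implementation" mentioned above so that the example runs without a database.

```java
import java.util.ArrayList;
import java.util.List;

// The rest of the app depends only on this interface, never on SQL strings.
interface PostStore {
    void addPost(String text);
    List<String> getPostsWithKeyword(String keyword);
}

// Stand-in for the first, simple implementation. A MySQL-backed or
// Dynamo-backed class would implement the same two methods.
class InMemoryPostStore implements PostStore {
    private final List<String> posts = new ArrayList<>();

    public synchronized void addPost(String text) {
        posts.add(text);
    }

    // Case-insensitive substring match over all stored posts.
    public synchronized List<String> getPostsWithKeyword(String keyword) {
        String needle = keyword.toLowerCase();
        List<String> hits = new ArrayList<>();
        for (String post : posts) {
            if (post.toLowerCase().contains(needle)) {
                hits.add(post);
            }
        }
        return hits;
    }
}
```

Swapping the backing store later means writing one new class that implements `PostStore`; callers do not change.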

  77. please reconsider using Google App Engine by keneng · · Score: 1

    Google App Engine rocks and supports many languages. You highlighted Java and I do believe Java was one of their first supported languages for App Engine.
    Google is open-source with their api and infrastructure. I'm not sure where you feel they are proprietary, but they are the most open-source company I have experienced. In fact, I have recently examined the Google App Engine docs and found them enjoyable and easy-to-use and especially with Google's new Go language support api. Everything about golang is free and open-source. It compiles on your ubuntu box if you have one and then you develop the app locally on the ubuntu box. How much more open-source do you need?

    The scalability comes from the fact that when you push the app to the Google infrastructure, all the scaling is handled transparently by Google. It keeps you focused on the problem rather than tripping over your feet like you are still deciding what tools to use and what companies to collaborate with.

  78. Start with scalable technologies! by MarkRose · · Score: 5, Insightful

    As someone who has written an application that scales to over 1 billion requests per day, let me offer my thoughts.

    Scaling your application should be as trivial as launching more application server nodes. If you can't add/remove application nodes painlessly, you've probably done something wrong like keep state on them (this includes sessions).
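One common way to keep session state off the application nodes, offered here as an illustration and not necessarily what the poster used, is to hand the client a signed token that any node can verify with a shared secret. A minimal sketch with the JDK's HMAC support follows; `SessionTokens` is a hypothetical class, and a production token would also need an expiry time and proper secret management, both omitted here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class SessionTokens {
    private final byte[] secret;

    // Every application node is constructed with the same shared secret.
    SessionTokens(byte[] secret) {
        this.secret = secret.clone();
    }

    // Token format: "<userId>.<base64url(HMAC-SHA256(userId))>".
    String issue(String userId) {
        return userId + "." + sign(userId);
    }

    // Returns the user id if the signature is valid, otherwise null.
    String verify(String token) {
        int dot = token.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String userId = token.substring(0, dot);
        byte[] expected = sign(userId).getBytes(StandardCharsets.UTF_8);
        byte[] actual = token.substring(dot + 1).getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual avoids leaking how many bytes matched.
        return MessageDigest.isEqual(expected, actual) ? userId : null;
    }

    private String sign(String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] sig = mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because verification needs nothing but the secret, a request can land on any node, and nodes can be added or removed without migrating sessions.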

    Don't worry about scaling your application layer at all (within reason). You can always throw more machines at the application side in a pinch, and for a long while it will be cheaper to add servers than to hire someone. When your application servers are costing you more than a salary, hire someone to find the hotspots in the code and make them faster. Until then it's a waste of your time.

    Scaling state, aka your datastores, is where the challenge lies. You need to spend a large amount of time sitting down and analysing every operation you plan to do with your data. SQL is great for a lot of things, but you will eventually run into a point where heavy updates make SQL difficult to scale. Mind you, decent hardware (lots of cores, RAM, and SSD) running MySQL should scale to several thousand active users if your queries are not expensive. The Galera patches to MySQL (incorporated into Percona XtraDB Cluster and MariaDB) can give you true high-availability, but you will still have write-throughput limitations.

    I would also highly recommend you look into Cassandra (especially 1.2+, with CQL 3), which was built from the ground up to scale across thousands of low end machines that often fail (if you can't tolerate hardware failure, you messed up). Cassandra is more limited in the kinds of queries you can execute, more relaxed with data consistency, and more thought is needed ahead of time. On the other hand, it can also be used for global replication, which is something you are interested in. At the very least, having a good understanding of its data and query model will open your mind to the kinds of tradeoffs that must be made to enable scaling.

    Contrary to what others are saying, you are correct to think about scaling now before you even start! Doing a rewrite is costly and expensive in money and time. Why set yourself up for that? Planning for scale before you start is the best time! If you start with a scalable datastore like Cassandra, and structure all your queries to work within its model, it is no more work than doing things in SQL, and you're way ahead of the game!

    The most important part is spending time modeling how you will access your data. Think about how you'll avoid hot spots (which make scaling writes difficult), and think about how to make reads fast by reading as little as possible. Think about caching, and how you'll invalidate the cache of a piece of your data without having to invalidate caches for things that didn't change. (Think about updating on data ingestion instead of running statistics later.) If you can't avoid hot spots, make only small reads, and cache independently, you are not done.
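The "updating on data ingestion instead of running statistics later" remark can be illustrated with a per-author post counter that is bumped at write time, so that reading the statistic is a single small lookup and no scan over all posts is needed. `PostCounts` is a hypothetical class, and the in-memory map stands in for whatever counter storage the datastore provides.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class PostCounts {
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Called from the ingestion path, alongside storing the post itself.
    void onPostIngested(String authorId) {
        counts.computeIfAbsent(authorId, k -> new AtomicLong()).incrementAndGet();
    }

    // Small read: one lookup, independent of how many posts exist.
    long countFor(String authorId) {
        AtomicLong count = counts.get(authorId);
        return count == null ? 0 : count.get();
    }
}
```

Keying the counter by author also spreads writes across many keys; a single global counter would be exactly the kind of hot spot the post warns about.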

    Good luck!

    --
    Be relentless!
    1. Re:Start with scalable technologies! by MacDork · · Score: 2

      This is the best response I've read in the entire thread. I just wanted to add, you are probably okay with SQL if you are familiar with that and you're expecting "thousands of users simultaneously." Postgresql 9.2 can hit around 14,000 writes per second. I'm sure MySQL is similarly capable. If you need more than that, then you have to go with something like Cassandra.

      Netflix has demonstrated Cassandra can hit 1.1 million writes per second on Amazon's commodity hardware. You just have to be willing to sacrifice consistency to get it.

      Finally, curb your enthusiasm. When you first have an idea that seems big, it's easy to get carried away. Ask yourself, is this something I really want to work on for the next five years of my life? When you start to dig into the actual implementation, you're gonna get bogged down in details you didn't consider when you were so enthusiastic.

      At that point, you're going to either think about the problem day and night until you find a solution, or you're going to say "This is stupid anyway" and want to move on. It's harder to move on once you've told friends/family/investors/etc. After doing it a few times, you look like someone who never follows through. When you DO have a really great idea that's solvable, you're the boy who cried wolf.

    2. Re:Start with scalable technologies! by Anonymous Coward · · Score: 0

      > write-throughput limitations.

      I guess that's the main issue the OP was thinking about: Scaling the web front-end isn't hard, but scaling the database back-end is much tougher on social sites like he has in mind, where each page/query is customized for each reader.

      > Doing a rewrite is costly and expensive in money and time.

      Not just that: It can spell doom for the company if it's stuck for months trying to rewrite the application when it's stuck.

      Rick Chapman has several examples of what can happen if you rush and don't plan:
      "In Search of Stupidity"
      www.joelonsoftware.com/articles/Stupidity.html

    3. Re:Start with scalable technologies! by Anonymous Coward · · Score: 0

      As someone who works for a famous multinational, serving just 300,000 requests a day on a Google App Engine F1 (600 MHz) class server, capable of scaling up to billions, yet currently requiring less than a handful of instances at peak time, I can tell you the following (standard warning here about premature optimisation, but you do need to design for some of this; basically, think of your app as read-only, with only occasional writes):
      - Cache service db read responses.
      - Cache controller responses where possible.
      - If you can serve a static response, serve the static file, without any controller; e.g. disclaimer and help pages should not be dynamically generated just because you think they should be read from a db. This removes significant overhead.
      - Disable sessions explicitly unless absolutely needed. Sessions cost performance and db.
      - Minimise instance startup time, e.g. with warmup requests and by explicitly not starting all servlets.
      - Avoid startup class scanning; it's relatively very slow (Java).
      - Avoid db writes as much as possible; this is critical for cost and performance.
      - Try to get away with single object writes as opposed to many object writes, e.g. user has an address field, not a separate address record. This sounds horrible from a traditional SQL background, but restructuring your db layer later, when you have billions of records, will probably be impossible because it would cost thousands of dollars, meaning your app can never perform its writes more cost-effectively and will always be non-optimal. Conversely, if you find by some perversion that you do need multiple objects, you can add these later at no cost except for creation overhead.
      - Consider minimising controller count: group related code into fewer controllers; this improves startup.
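The first item, caching db read responses, might be sketched as a small read-through cache with a time-to-live. This is an illustration only: `TtlReadCache` is a hypothetical class, the `dbRead` function stands in for the real datastore call, and the clock is injected so expiry can be exercised without sleeping. On App Engine one would more likely use the platform's memcache service than an in-process map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

class TtlReadCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> dbRead;   // stand-in for the real datastore read
    private final long ttlMillis;
    private final LongSupplier clock;      // e.g. System::currentTimeMillis

    TtlReadCache(Function<K, V> dbRead, long ttlMillis, LongSupplier clock) {
        this.dbRead = dbRead;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Serves from memory while the entry is fresh; otherwise reads through
    // to the database and caches the result. Two threads missing at the same
    // moment may both hit the database, which this sketch tolerates.
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || entry.expiresAt <= now) {
            entry = new Entry<>(dbRead.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

A short TTL bounds how stale a response can get, which fits the "read-only with occasional writes" mindset described above.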

      There are others, but these are tried and proven-in-fire cloud performance improvements. App Engine uses a key store (aka NoSQL) for a reason, and whilst awkward if you're used to traditional constraints, it allows much greater flexibility and arguably makes you a better programmer, since you need to always assume constraint violations, if you had them, would appear in code by way of incomplete (non "joined") results.

    4. Re:Start with scalable technologies! by Common+Joe · · Score: 1

      I'm going to add a couple of articles I liked for your consideration. The articles and some of the technology are old, but the ideas are probably still sound.

      http://highscalability.com/amazon-architecture

      http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster

      http://queue.acm.org/detail.cfm?id=1142065

      http://www.webperformancematters.com/journal/2007/8/21/asynchronous-architectures-4.html

      I'd be curious to see responses to these since these articles are old. I haven't been to these websites in a while, so maybe they have some other interesting and more up to date articles.

      As for PostgreSQL, I'd recommend this book. The first few chapters apply to any database. The next relate to PostgreSQL specifically. If you're a code head (like me), this book may be better suited for the DBA, but we don't know your specific needs.

    5. Re:Start with scalable technologies! by Terrasque · · Score: 1

      As someone who hasn't written such a scalable app, but have been interested in the issues around it, this was more or less my first thought too.

      App part is usually just "throw hardware at it", but DB part can be really hairy. You can get far on a one machine DB, but once you need to go past that, you got trouble. Unless you've already taken that into consideration.

      Also, it's good to be familiar with the common problems and workarounds for scaling. Caching is one such thing. Often you've got several possible ways to do things, and some of them lend themselves better to scaling than others, often with little or no extra work. Or it's just a small, hardly noticeable tweak to functionality that's needed, but something that will be a royal PITA to change later on.

      So my advice: Get familiar with scaling problems and solutions, keep them at the back of your mind when creating the site, but don't go out of your way. Not yet, at least. Maybe take time to make easy "hook" areas in the code where you can insert scalability later (e.g. design things so it's easy to move parts of it to RPC later on, and make it easy to later add caching to areas).

      --
      It's The Golden Rule: "He who has the gold makes the rules."
    6. Re:Start with scalable technologies! by AleX122 · · Score: 1

      This is my favorite answer so far. Thank you for hints like when to hire additional help, and for pointing out Cassandra. I agree with many other responders that I should start simple, and I will. However, you caught my point that I want to prepare upfront for scaling and avoid a rewrite. For sure at some point I will have to hire an administrator who will set up proxies/dispatchers and configure servers. I do not know if it is the right direction, but I will plan the network architecture later, since it will depend on the software I am going to use. At the moment the draft of the architecture looks as follows: JavaScript client side (GWT), JSON services on the server side (Java/Scala), Cassandra on the backend.

    7. Re:Start with scalable technologies! by MarkRose · · Score: 1

      The one other thing I missed is to also think of designing a service oriented architecture. Every role that your system has, such as authentication (I'd use OAuth2), should be its own service. By using clearly defined APIs, it will make it easier to replace pieces of your system with new ones (even written in new languages), and it will give you an interface to write tests against.
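"An interface to write tests against" could look roughly like the following: a contract check written once against the service interface and then run against every implementation, old or new. Everything here is hypothetical (`AuthService`, `AuthContract`, `FakeAuthService`), and the fake keeps plaintext passwords in a map only because it is a test double; this is not OAuth2 and not a real authentication scheme.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

interface AuthService {
    String login(String user, String password); // returns a token, or null on failure
    String userForToken(String token);          // returns the user, or null if unknown
}

// Reusable contract: expects a user "alice" with password "s3cret" to exist.
final class AuthContract {
    static void check(AuthService service) {
        String token = service.login("alice", "s3cret");
        require(token != null, "valid login returns a token");
        require("alice".equals(service.userForToken(token)), "token maps back to its user");
        require(service.login("alice", "wrong") == null, "bad password is rejected");
        require(service.userForToken("no-such-token") == null, "unknown token is rejected");
    }

    private static void require(boolean ok, String what) {
        if (!ok) {
            throw new AssertionError("contract violated: " + what);
        }
    }
}

// Test double only: in-memory, plaintext passwords, no expiry.
class FakeAuthService implements AuthService {
    private final Map<String, String> passwords = new HashMap<>();
    private final Map<String, String> tokens = new HashMap<>();

    void addUser(String user, String password) {
        passwords.put(user, password);
    }

    public String login(String user, String password) {
        if (password == null || !password.equals(passwords.get(user))) {
            return null;
        }
        String token = UUID.randomUUID().toString();
        tokens.put(token, user);
        return token;
    }

    public String userForToken(String token) {
        return tokens.get(token);
    }
}
```

When the service is later rewritten, even in another language behind an HTTP client that implements `AuthService`, the same `AuthContract.check` call tells you whether the replacement behaves the same.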

      --
      Be relentless!
  79. exactly by Anonymous Coward · · Score: 0

    n/t

  80. Insufficient Information by Jane+Q.+Public · · Score: 1

    OP did not tell us WHAT he needs to scale.

    Front-end is a given. But that is relatively simple to handle. What else? Database? Business logic?

    Kind of hard to give advice when you don't know what you're giving advice for.

    1. Re:Insufficient Information by Jane+Q.+Public · · Score: 1

      Oops. Never mind. He did too. Processing and data storage.

  81. Re:Java is SLOW by coaxial · · Score: 1

    Regarding FB's PHP: You do realize there's a reason why HipHop was written right?

    So why were PHP and MySQL chosen for FB? Because those are the only things that The Zuckster knew.

  82. It's called a desktop app by Anonymous Coward · · Score: 0

    For crying out loud, keep it simple; force the users to not have instantloadtimes nowaitbecauseI'mabraindeadfucktardlol.

  83. Windows Azure is one of the best choices by Anonymous Coward · · Score: 0

    Try windows azure. Super easy to build large scale applications.

    1. Re:Windows Azure is one of the best choices by PPH · · Score: 1

      My kingdom for a +1 Funny mod point.

      --
      Have gnu, will travel.
    2. Re:Windows Azure is one of the best choices by crutchy · · Score: 1

      the only tiny little problem with azure is that it is windows based... apparently there are linux guests available, but if there were also linux hosts available it would be much more popular :-D

      sorry Microsoft, but it doesn't matter how great you make azure... the "windows" brand is doing more harm than good in data center virtualization

      maybe if they just called it "Microsoft Azure" they might lose the negative connotation of windows, but even without a huge amount of mass-marketing linux has forged an unstoppable presence in the data center; it's pretty much a given that as a corporation reaches a certain size that linux will play an increasing role in server infrastructure due to benefits in security, reliability, scalability and TCO (the only historical competition in this regard has been from aix, z/os and solaris). regardless of how new Microsoft product releases actually compare to Linux, they will have to contend with this linux data center stereotype (particularly difficult for any product that brandishes the "windows" stigma)

  84. I'm building one right now by Anonymous Coward · · Score: 0

    I am building such a beast myself right now. I'm not using Java, but what some of the other kids are using: Linux, Apache, MariaDB, PHP. For more giggles, there is memcache (that's good for thousands more users), plus common sense stuff relating to how big a hit you put on your system w.r.t. data (big images suck bandwidth). Even clipping an image in several ways means showing part of an image here, another part there, but it's only one image, and once it's downloaded you can use it in dozens of places. The Facebook folk use memcache too, as well as APC, and then they created hiphop-php, which is compiled PHP. So much for slow. Now they don't use a standard SQL database anymore (they are using NoSQL), but it depends on how big you get.... did you say 1 billion regular users? Personally, I would be happy with 2-3 hundred million. First you need the fancy idea though, right?

  85. Vendor Lock-in by afgam28 · · Score: 1

    It's cool that you're interested in developing to an open standard, but I think it's worth noting that there are two kinds of proprietary platforms.

    The first is platforms like Google App Engine or Windows. These platforms lock you in, by forcing you to write your code to a certain API. If you decide you don't want to keep using this platform, it's really hard to move to something else. The bottom layer of the platform forces a lot of implementation details in the upper layers of the system.

    Then there are things like EC2 or GCE. EC2 gives you a pretty standard Linux machine (unless you choose another OS) and you run standard Linux applications on it. There isn't much lock-in, and the bottom layer of the stack can more easily be swapped out. There are features of EC2 like auto-scaling that are available to you, but you don't have to modify your application when you move to a different auto-scaling implementation.

    It sounds like you're a software guy, and so getting someone else to manage your data center is probably a good idea for your situation. That way, you can focus on the software. You can always move platforms later if you decide Amazon is not doing a good enough job, or if you find a more open platform that is as good.

  86. First things first... by JWSmythe · · Score: 1

        First things first... Do you have hundreds of thousands of simultaneous users?

        I worked at a shop where we did. I've known a lot of others who didn't, but their goals were just as lofty. Some actually said "millions of simultaneous....", but only ever managed to get dozens, or even only one dozen at peak times, including themselves and their friends.

        Even at the shop where we had hundreds of thousands of simultaneous users, they didn't start out like that. It grew into that over several years.

        If you have that many users, you should have the financial base to hire developers who already know what they're doing, system/network admins who can do the infrastructure properly.

        As you're saying that you're to be the only dev, and you don't have the resources to hire any more, stay simple, and let it grow.

        No big site ever designed for their final need. It grows over time. Just keep an eye on your current need and growth projections.

    --
    Serious? Seriousness is well above my pay grade.
  87. Re:Java is SLOW by Anonymous Coward · · Score: 0

    Yes, the JVM kicks PHP and other scripting languages' asses.

    The misunderstanding comes from the fact that nobody writes Java programs like they write PHP scripts. Of course, you could just create a JSP file and start writing PHP-style code. It would be faster & better than PHP and actually have sane, non-buggy libraries.

    But Java programmers love their abstractions and frameworks, and some of those frameworks are slow. For simplistic use-cases, PHP will seem like it's "faster".

  88. Re:Java is SLOW by MrBandersnatch · · Score: 1

    "Here is a high performance well coded php mvc framework if you do decide to go that route: http://www.yiiframework.com/ [yiiframework.com]"

    It's 6:30am here; please don't make me laugh so hard this early in the morning.

    high performance, php in the same sentence *chuckle* *gafwah*

  89. Say No to App Engine by Anonymous Coward · · Score: 0

    As the lead dev for a major startup, I can tell you app engine is a dead-end. I inherited app engine from one of the founders who is a decent, but not amazing coder and I can tell you it is a nightmare in so many ways, mostly because of app engine itself.

    If you are getting funding and personnel like Khan Academy, I am sure App Engine can work fine for you. Otherwise, there are many problems including:

    1. Despite being "open source," Google takes forever to fix some critical issues and isn't very fast accepting any community patches. The encoding issues with multi-part forms for example are still not fixed years later. Finally, a few weeks ago, after years, they released a patch and the issue is still broken with post. Read the issues carefully and you will see other examples.

    2. The cost is also deceptive. Although your site will handle traffic spikes and you can cap the quota, you will probably start coding in fear of the pricing model. What I mean by that is you will make decisions not based on the best architecture, but based on avoiding excessive app engine costs. For instance, many app engine devs write blog posts about circumventing write or reads on the datastore because of cost, not performance.

    3. App Engine is a black box. Once the thing is deployed, you have no great way of debugging it or doing extensive performance analysis without largely rolling out your own toys. If something goes wrong, it is very difficult and time consuming to fix for many reasons. First, the dev environment merely simulates the production environment and many features do not work the same or at all in the dev environment. The dev environment is also single-threaded which makes life miserable when doing any sort of call/response web service interactions.

    Second, many libraries are just pitiful on app engine or there is no support. You cannot use anything that calls c-modules in python. Not a big deal maybe at first, but you end up with weird things happening sometimes in libraries you don't expect. For instance, we were having SSL errors with rackspace, and it turns out that you cannot catch the SSL exceptions from Rackspace's library since the exceptions come from a c-module.

    Thirdly, many bugs only appear in production and not dev, or vice-versa. You cannot easily patch some of the lower-level portions of the infrastructure on the server without Google first fixing them. The best you can do sometimes is upload your own version of a python package and hope that it lets you override the app engine settings (for example, webob). Again, Google is extremely slow fixing these issues and because they own the infrastructure including the server software, you are at their mercy.

    4. Outages. Despite the notion Google = Uptime, it's simply not true. Go back and start looking at the stats. I can also tell you they lie on the global stats because many times our servers have gone down and they haven't logged any global issue even though it was more than just us. To Google's credit, in many of these circumstances they credited us with the lost time, but on the other hand a few dollars doesn't make up for hours of lost business and user activity.

    5. Performance. App Engine data store performs and queries pathetically compared to other similar alternatives such as Cassandra or even Hadoop. We've spent tons of time optimizing our site and although it works better than before, we've had to get creative in ways I don't have to with other technologies. I ask sometimes what is the point of a data store that can scale huge that I can't query, read, or write to without huge concerns. App Engine data store is not on par with Google's own secret sauce, sorry.

    6. Framework support. Unless you want to roll your own, you'll have to modify or pick from a pathetic stack of frameworks. Most good Python or Java frameworks that are good stand-alone are half-baked when it comes to their app engine versions. For example, a lot of people want to use Django and thus the natural choice seems like Django non-rel. What you will

    1. Re: Say No to App Engine by buddyglass · · Score: 1

      I bow to your experience. My recommendation was based more on the "theory" of AppEngine than the actual reality. That said, if you were asked to recommend a PaaS solution that supports Java and has the potential to scale "way up" with minimal effort, what would you suggest? Think "AppEngine done right". Others have mentioned Heroku and OpenStack; have any thoughts there? Others I'm omitting?

    2. Re: Say No to App Engine by DuckDodgers · · Score: 1

      CloudBees and OpenShift.

    3. Re:Say No to App Engine by DuckDodgers · · Score: 1

      I'm curious, have you looked at AppScale? http://appscale.cs.ucsb.edu/ It's an open source re-implementation of the Google App Engine APIs. It's not totally feature complete, but it has quite a bit and allows you to move your Google App Engine software (if you're using the subset of APIs AppScale supports) to Amazon EC2, your own servers, or other hosting services.

      It seems to me that this kind of thing is ideal, these open source toolkits that provide the facade of Amazon Cloud and Google App Engine APIs permit competitors to challenge the giants.

  90. .NET by betterprimate · · Score: 1

    .NET, of course.

    On a serious note, I would recommend looking into heroku. Not excluding all the great input already posted in this thread.

  91. Re:Java is SLOW by terjeber · · Score: 1

    Here is a ... well coded php mvc framework

    That's an oxymoron.

  92. Go to hell by Anonymous Coward · · Score: 0

    Stuff like this doesn't come free. Hire people who know what they're doing if you don't. It's insulting to people who actually know what they're doing that the OP expects to "just get a few pointers" and go off and build this sort of system. And before you get started, the OP is not asking for some advice to start learning the process of what needs to be done here. He or she just wants a solution now now now.

    How would writing majors feel if we asked them "I'd like to write a best selling novel. I've done some creative writing and I'm comfortable writing short essays, but I can't afford a ghost writer who's actually good at what they do. What should I do to get started?" It's not that simple.

    1. Re:Go to hell by crutchy · · Score: 1

      the OP is only insulting to morons like yourself who are apparently unable to make any kind of valuable contribution

      every grand idea starts small... rarely if ever is an entrepreneur able to secure external investment (loans, vc, grants, angel investment, partnership, acquisition, etc) based solely on an idea without any kind of working prototype or demo, and also without a sound business plan and demonstrated personal commitment of time and capital (those days ended in 2000 with the dot-com collapse).

      also, many professionals are happy to share a small amount of their expertise for free occasionally even if only to give their ego a little boost. consultants also usually offer a little bit of free advice to help entice potential clients or build rapport and reputation (every consultant knows that word-of-mouth is the best marketing tool)... for example many solicitors provide a no obligation 30 minute initial consultation (which i have taken advantage of myself) and many are quite happy to offer advice in those free sessions even if they are doubtful of further consultation for the particular case.

      i've posted my thoughts previously on this topic, and there are many other good contributions in this thread (particularly about database back end selection) and i wish the OP well in his (or her) endeavor.

      there will always be naysayers towards any enterprise... only those that rise to the challenge and prove naysayers wrong will succeed.

  93. use CQRS by arachnoprobe · · Score: 1

    You might want to have a look at Command Query Responsibility Segregation (CQRS) as a concept.
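
    For anyone who hasn't met the term, a bare-bones sketch of the idea in Python. All the names here (`PostMessage`, `WriteModel`, `MessageCountView`) are invented for illustration; real CQRS setups usually put a message bus and separate datastores between the two sides. Commands go to a write model that validates and records changes, and queries are answered from a separate read model that is kept up to date from the resulting events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostMessage:
    """Command: expresses an intent to change state."""
    author: str
    text: str


class WriteModel:
    """Command side: validates commands, records events, notifies read models."""

    def __init__(self):
        self.events = []
        self.subscribers = []

    def handle(self, cmd):
        if not cmd.text.strip():
            raise ValueError("empty message")
        event = {"type": "MessagePosted", "author": cmd.author, "text": cmd.text}
        self.events.append(event)
        for notify in self.subscribers:
            notify(event)


class MessageCountView:
    """Query side: a denormalized per-author counter built from events."""

    def __init__(self):
        self.counts = {}

    def apply(self, event):
        if event["type"] == "MessagePosted":
            author = event["author"]
            self.counts[author] = self.counts.get(author, 0) + 1

    def messages_by(self, author):
        return self.counts.get(author, 0)


write_side = WriteModel()
view = MessageCountView()
write_side.subscribers.append(view.apply)

write_side.handle(PostMessage("alice", "hello"))
write_side.handle(PostMessage("alice", "again"))
write_side.handle(PostMessage("bob", "hi"))
```

    The point for scaling is that the read side never touches the write model, so it can be replicated or cached independently of where writes are processed.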

  94. Do we get paid? by bryan1945 · · Score: 2

    If we're doing work for you, how much do we get?

    --
    Vote monkeys into Congress. They are cheaper and more trustworthy.
    1. Re:Do we get paid? by crutchy · · Score: 1

      no harm in asking i guess :-)

    2. Re:Do we get paid? by Anonymous Coward · · Score: 0

      If we're doing work for you, how much do we get?

      Everyone else gets 1 thread full of insights, tech analysis, and snide comments... But you, you're not really working out. Please pack up your internets and go.

  95. sysadmin chiming in here by CAIMLAS · · Score: 0

    Things I've noticed as a sysadmin, for 'scalable' apps:

    * Java? Do not use it. Sorry. Just don't. It's shit and nobody seems to be able to write it worth half a damn to scale out.
    * That goes double for Java running on Tomcat.
    * That Java framework you know? It's expounded shit and will consume all available system resources with only a handful of users.
    * It doesn't matter how good a developer you are if you're a Java developer: your application will be CPU and memory intensive. (See the first three points.)
    * Whatever you do should make heavy use of in-browser processing via javascript.
    * Use MVC, or preferably, MVP concepts in development.

    Aside from a front-heavy application, your concerns will be database access. Design the database so that it will scale broadly; preferably pull someone in who knows a thing or two more about normalization than you do to do this, if only to have a second set of eyes.

    --
    ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    1. Re:sysadmin chiming in here by AleX122 · · Score: 1

      Thanks for the reply.

      * I have fairly long experience with Java. Yes, it is resource hungry, but mainly at the start: a simple hello-world application will consume far too much memory (RAM). After that starting point, however, adding more and more business logic does not mean that memory consumption grows linearly.
      * The framework I mentioned, GWT, is more of a toolbox in which you write both client- and server-side code in Java. There are no architectural constraints. After compilation, however, you get JavaScript on the client side and Java bytecode on the server side. So there will be a lot of processing in the browser. (This brings another problem: since the state of the browser cannot be trusted, all data received from the client must be validated, even if it is assumed to have been calculated by your own code. Then there is synchronization of state between server and client. Luckily, I know this technology well enough to know how to handle this.) Anyway, I will use GWT only to generate the client side. I know that writing the client in pure JavaScript would probably produce better results, but I am not a master of JS, and it would take me much longer and produce a lot of bugs. Maybe later, during optimization, some client code will be rewritten.
      * GWT supports MVP as well
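
      To make the "browser state can not be trusted" point concrete, here is a small sketch, in Python rather than Java/GWT to keep it short. The price table, the payload shape and the `validate_order` name are assumptions made up for the example. The server recomputes anything the client claims to have calculated and rejects the request on a mismatch.

```python
# Hypothetical server-side price table, in cents.
PRICES_CENTS = {"basic": 500, "pro": 1500}


def validate_order(payload):
    """Return the server-computed total in cents, or raise ValueError.

    `payload` is whatever the client sent; nothing in it is trusted.
    """
    items = payload.get("items")
    if not isinstance(items, dict) or not items:
        raise ValueError("items must be a non-empty mapping")

    total = 0
    for sku, qty in items.items():
        if sku not in PRICES_CENTS:
            raise ValueError(f"unknown item: {sku!r}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(qty, bool) or not isinstance(qty, int) or not 1 <= qty <= 100:
            raise ValueError(f"bad quantity for {sku!r}")
        total += PRICES_CENTS[sku] * qty

    # The client-side code computed a total too; use it only as a sanity check.
    if payload.get("total_cents") != total:
        raise ValueError("client total does not match server total")
    return total
```

      The client-side calculation is still useful for a responsive UI; the server just never takes its word for the result.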

      Yes Java and GWT are my hammers ;)

  96. do these searches by Anonymous Coward · · Score: 0

    http://www.google.com/search?q=scaling%20mysql%20slideshare%20presentation

    http://www.google.com/search?q=scaling%20webapps%20slideshare%20sharding

    http://www.google.com/search?q=scaling%20master%20slideshare%20slave%20cluster

    http://www.google.com/search?q=scaling%20slideshare%20tumblr%20flickr

  97. If you have a scaling problem, you don't have one. by Qbertino · · Score: 1

    1st Rule on scaling: If you have a scaling problem, you don't have a problem.

    Wrong approach. Yes, many have said it and I'll say it again and it will remain true for all eternity.

    If you think you've got the next Google or Facebook up your sleeve - well so be it.

    Build your app, use regular common sense when doing it, and the rest just happens. I've handled upwards of 20 million active users, with user tracking and billing and a few thousand hits per second per product, at an internet gaming company, and I can tell you that when a product has to scale, it will, even if server duplication is done with Perl magic by a handful of admins cloning one drive to the next using a checklist on a wiki.

    The thing you will need most when you have to scale is money. The time you would spend building the perfect scaling system from scratch from the get-go is worth a million times more if it is spent on building business contacts and getting VCs and angels with good contacts and/or cash to invest on board. If your app isn't a total mess of spaghetti code and doesn't ignore the most basic architectural rules, you're better set for scaling than most large apps out there. For example: click around eBay for a few moments and try to imagine what's going on behind the scenes there, and think of how it grew and how and when eBay started out. I'm currently working on a financial app for a *very* large international bank. The app's foundation is an 8-year-old copy-pasted and slightly modified grey goo of Dreamweaver HTML/JS and anti-object-oriented PHP 4, an app so bizarre it defies any description - and yet it is the key product of the shop and beats the competing Java app in terms of usability and flexibility.

    Anybody here will tell you that scaling a PHP app to a billion users won't work and you should forget PHP right away. And yet Facebook is here and they're scaling pretty well as far as I can tell. They even got a few devs working on a PHP JIT compiler (HipHop) the last few years. Again, as you see: Scaling problems are *exactly* the kind of problems you want to have.

    Bottom line:
    Make it work, make it beautiful and worry about scaling when it happens. All else is nonsense.

    P.S.:
    Premature scaling worries aside, in terms of technology today I'd go for Nginx and JavaScript in the Front and Back, using Node.js as the server-side technology. It seems stable enough to build something serious with it and you've got one PL for both server and client. It's like in the good old days of Netscape Webserver. ... My 2 cents.

    Good luck.

    --
    We suffer more in our imagination than in reality. - Seneca
  98. Focus on structure and algorithms by Anonymous Coward · · Score: 0

    Just as Linus put it. Focus on your structures and algorithms ("architecture") to ensure good scalability upfront.
    Then work out details (SQL queries and whatnot) for your platform/technology and try to find static vs dynamic parts and separate them properly. Static parts can be cached 100% LATER with some servers in front, which will give you almost infinite scalability there. Your basic setup should cover around 90% of your dream setup, as the last 10% are for versions 2 and 3 of your application, and these are really hard.

    Then start working, and focus on clean structure and code right now rather than on premature optimization; with a clean structure, optimization can be added later. If you're doing Java code, please do not generate lots of short-lived objects.
    Now you should be good to go.

    PS: I really don't understand the mind set of 99% of the answers here. They need to hand in their geek cards...

  99. Re:Java is SLOW by jma05 · · Score: 1

    > Java is slow*

    As others have noted, this is completely wrong. Java bytecode, over a JVM in server config, after the JIT has warmed up (which is the case for web apps), is very close to C++ in performance.

    > Google/Youtube use python

    Only for select portions of the site that see regular updates or have little logic (uploads, the labs portions etc). For the rest, they use a lot of Java, along with C and now Go.

    Python and PHP web sites can be quite scalable. This is not because Python and PHP are themselves fast (they are 5-20 times slower than Java when run pure), but because most of the work is in fact done over C extensions (Python regex is actually a tad faster than Java regex because it uses a native module). If most of the CPU time is spent applying templates rather than any per-page logic, simply switching to a native templating engine works wonders. This is fine. I don't care where my code is run as long as I don't need to write managed code and I use Python a lot for scientific code without speed problems.

    To query a database, run an XPath query, call a scientific extension or apply a template with the native cheetah template engine, Python won't be much slower than C. But logic in Python itself is SLOW and it shows in a tight loop (not something web sites do much of anyway). So Python makes it easy to write C extensions without getting hands dirty (Cython, Shedskin etc). Put another way, Python is a slow language (implementation, really) that does not let the slowness get in the way, most of the time. Same goes for PHP.

    > * Yes, I know *in theory* in a certain very limited set of circumstances Java can be faster than compiled code, but the theory doesn't actually match the practical reality of the situation.

    Many Java web frameworks do a lot more stuff than typical PHP setups. For example, I use ZK. It maintains the entire client UI model on the server. This is definitely not meant to scale. But the abstractions save me a lot of work for what I want to do.

    Here are some recent benchmarks on how Java and PHP perform under load, especially when straight Servlets are used. No comparison.
    http://www.techempower.com/blog/2013/03/28/framework-benchmarks/
    http://www.techempower.com/blog/2013/04/05/frameworks-round-2/

    > Disclosure: I run a high traffic website that get's millions of page views a day. Uses Yii php framework

    But what does your site mostly do? Just serve static content for most part? with the pages filled with some straight-forward queried content? Yii framework looks like a basic MVC framework with few additional abstractions. It should not matter what language you use for something like this. I suspect that the same would be true for the OP. I don't disagree with your conclusions, just not for the same reasons.

    The final advice to OP: Start with whatever takes the least capital. There are frameworks that have plugins for most of the common web stuff (Rails, Grails etc) and these are the right places to get started quickly. Hundreds of thousands of visitors/day is not much for modern machines. Even if the site is 20 times slower, clustering to 20 servers is cheaper if the slower but more productive technology saves one developer-year of work. A little optimization later will probably give you a lot of mileage. Switching to a scalable architecture, using technology built for scalability like Go, Hadoop and NoSQL/HBase, won't need to be a concern until one is rolling in cash. Those need more expensive devs. Premature optimization and all that.

  100. Is this a business? Or a religion? by jimicus · · Score: 1

    You mentioned wanting to stick with open standards.

    I would point out that if this is ultimately to run as a business, you need to make decisions based on what's best for the business. Which may or may not be something based around open standards.

    Making a decision early on and sticking to it dogmatically even when there is no business benefit in doing so - and refusing to even contemplate alternatives simply because they're "not open" sounds dangerously close to operating a religion rather than a business.

  101. Platform to develop web apps fast by Anonymous Coward · · Score: 0

    Hi,

    My suggestion is for you to look into the Agile Platform, from OutSystems (www.outsystems.com).
    I came across it some years ago, and once you ramp up on its usage, you'll be able to deploy web apps much more proficiently.
    It has a free edition, which I think can only run on Windows. With the paid version, however, you can choose between the Java or .NET stack.

    Hope you find what you're looking for,

    cheers,

    "Anonymous Coward"

  102. Wrong question by SpinyNorman · · Score: 1

    Unless you have the money and management experience to start a company, hire developers, etc, then the technology that you'd hypothetically use is irrelevant. OTOH if you do have the money to hire developers with the right skills then the problem will solve itself.

    The notion of not having the money or management skills but somehow bootstrapping yourself up from nothing is almost certainly not going to happen. Even those who did start major companies without venture capital did so by borrowing significant money from family and friends, and only succeeded because they also happened to have the management skills.

    Creating a successful start-up is more (or ALL) about having the right people than the right idea. In fact, the various start-up incubators are happy to fund the right people even if they don't have an idea, and many/most start-ups, if eventually successful, don't end up with the same company/product idea they originally started out with. You may think you know what the world needs, but the world will tell you if you're right or not, and you'll only be successful if you adapt.

    If you have the right stuff to create a startup then I think your best bet is to apply to a start-up incubator that will finance you and hook you up with the right people, but they are mostly looking for teams rather than individuals.

  103. Re: Building a web apps scalable by Anonymous Coward · · Score: 0

    As an ecosystem to address your expectations, I would suggest OpenStack infrastructure with Juju, and Cloud Foundry.
    For Java development, I found WaveMaker a great tool that generates standard Java code projects, which can be easily opened in the Eclipse IDE.
    For the web servers, Tomcat or Jetty.

    Links related to these technologies:

    Juju for OpenStack - http://www.slideshare.net/fasgoncalves/juju-on-ubuntu-cloud
    Cloud Foundry - http://www.slideshare.net/fasgoncalves/cloud-foundry-and-openstackcloud
    WaveMaker - http://www.wavemaker.com

    Also look at Ubuntu Cloud : http://www.slideshare.net/fasgoncalves/ubuntu-cloud-infrastructures

    FranciscoG.

  104. Don't use Oracle... by Anonymous Coward · · Score: 0

    Oracle sucks balls.

  105. Don't worry by Fatty · · Score: 1

    Just some things I've learned over the years while working on high and low volume websites:

    * Spend your energy coming up with the product and figuring out your customers' needs. Chances are you won't run into scaling problems until later. Your first goal is to get that far.
    * What you think will be the bottleneck when you start out will probably not be it. The ugly part is that you won't know what it is until it hits you.
    * Read through some of Brad Fitzpatrick's presentations at http://danga.com/words/ (They're mostly variations on the same theme, pick one of the later ones). Yes it's 6 years old at this point, but little has changed. OK, maybe schemaless datastores. But look at what livejournal did on commodity technology.
    * Don't fall into the temptation of using sexy technology because it solves a problem you don't have yet. You can do a heck of a lot with MySQL and Postgres.
    * Your choice of technology isn't as important as your development practices. Automate your testing. Automate your deploys. Automate your testing. Stick with the languages you know.
    * Measure. Something like New Relic will help you spot your problems and fix them.

  106. Re:Java is SLOW by Anonymous Coward · · Score: 0

    MySpace v1: Java
    Facebook: PHP

    nuff said

  107. Red dwarf? by dr2chase · · Score: 1

    Not sure this is right for you, but it was once upon a time designed to match some of your buzzwords.

    http://sourceforge.net/apps/trac/reddwarf/

  108. Design with clustering in mind by fzammett · · Score: 1

    While I agree with everyone that says build small to start and scale up later as needed, the one caveat I'd give is whatever technology you use, design with the THOUGHT of clustering from the start. I've seen many designs fall down when scaled because, for example, the app used session too liberally and now session replication across clustered nodes is a serious problem.

    There's nothing that says you must use clustering later; there are other approaches. But if your app inherently can't be clustered because your design doesn't allow for it, then it can turn into a real headache quickly.

    If you design as statelessly as possible then you're likely to be fine when you go to scale, whether vertically or horizontally. That's a simplistic answer, but it will be correct, to a reasonable approximation of "correct" :)
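
    One way to picture "as statelessly as possible", sketched in Python with only the standard library. This is an illustration, not something the comment prescribes: `SECRET`, `issue_token` and `read_token` are made-up names, and the token has no expiry, which a real one would need. Instead of keeping login state in a server-side session that must be replicated across nodes, the server hands the client a signed token, and any node holding the same secret can verify it.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret shared by every node in the cluster.
SECRET = b"change-me"


def issue_token(data):
    """Serialize `data` and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return (body + b"." + sig).decode()


def read_token(token):
    """Return the decoded data, or None if the signature does not match."""
    body, _, sig = token.encode().rpartition(b".")
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    # Constant-time comparison to avoid leaking how much of the signature matched.
    if not hmac.compare_digest(sig, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(body))


token = issue_token({"user_id": 7})
```

    Because no node has to remember anything between requests, a load balancer can send each request to whichever node is free, and session replication stops being a problem. The trade-off is that the token contents are readable by the client (signed, not encrypted) and cannot be revoked without extra machinery.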

    --
    If a pion (n-) collides with a proton in the woods & noone is there to hear it, does lamdba decay into the source pa
  109. Re:HAHA Ha ha ha haha. by Anonymous Coward · · Score: 1

    Can you use Google to learn how not to be a fucking douchebag?

  110. Give PERL a shot. by Anonymous Coward · · Score: 0

    It's a proven technology that scales very well, and is very easy to learn and maintain. Even Indian H-1B programmers like it. Why do you think the U.S. government runs smoother than a baby's butt?

    PERL. Learn it. Love it. Live it.

  111. Re:Start smaller, add metrics early by davecb · · Score: 1

    Definitely start small, but make sure you measure the response time and load from the beginning, as close to the user as possible.

    The load will tell you how many users you're gaining, albeit in computer terms, and the response time will tell you if the system is starting to annoy people by slowing down. If you plot RT against load, you'll get the curve you need for capacity planning when and if the program becomes popular.
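
    A minimal sketch of that RT-versus-load curve in Python. The sample numbers are invented purely to show the shape of the calculation, and `response_curve` and `first_load_over` are hypothetical helper names. It buckets (load, response time) samples by load, takes the median response time per bucket, and reports the first load level where the median crosses a limit you care about.

```python
from collections import defaultdict
from statistics import median


def response_curve(samples, bucket_size=50):
    """Group (requests_per_second, response_time_ms) samples into load buckets.

    Returns a sorted list of (bucket_start, median_response_time_ms).
    """
    buckets = defaultdict(list)
    for load, rt_ms in samples:
        buckets[int(load // bucket_size) * bucket_size].append(rt_ms)
    return sorted((load, median(rts)) for load, rts in buckets.items())


def first_load_over(curve, limit_ms):
    """Lowest load bucket whose median response time exceeds limit_ms, if any."""
    for load, rt_ms in curve:
        if rt_ms > limit_ms:
            return load
    return None


# Made-up measurements: (requests per second, response time in ms).
samples = [
    (10, 80), (40, 90), (60, 95), (90, 110),
    (120, 180), (140, 220), (160, 450), (190, 700),
]

curve = response_curve(samples)
knee = first_load_over(curve, limit_ms=300)
```

    In practice the samples would come from your own request logs or from a monitoring tool, measured as close to the user as you can get.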

    --dave

    --
    davecb@spamcop.net
  112. Re:HAHA Ha ha ha haha. by jasonlfunk · · Score: 1

    An 18 year old in a 30 year old body. Grow up.

  113. Re:Java is SLOW by Anonymous Coward · · Score: 0

    Yes, I know *in theory* in a certain very limited set of circumstances Java can be faster than compiled code, but the theory doesn't actually match the practical reality of the situation.

    In my experience, Java has basically two areas where performance is an issue: initial startup, and certain graphics operations. Once Java is running, and so long as you're not doing any sort of 3D graphics or such, Java is as fast as anything, including compiled C code.

    Neither of those apply to web apps. You'll only have to bring your server up once in a blue moon, and you'll just be answering HTTP requests, not driving a FPS. Java KILLS at that kind of stuff.

    And yes, lots of huge websites (Amazon, Ebay, Facebook) use it.

  114. ZX Spectrum by javawocky · · Score: 1

    This reminds me of the old days when I was going to write the next great game on the ZX Spectrum. You had to first make the coolest intro screen before writing an inch of code, that's normally all you got out the door.

  115. App Engine + Django! by speedplane · · Score: 1

    I'm a big fan of using Django on top of Google App Engine. You get Google App Engine's awesome infrastructure as well as Django's standard interface and community.

    --
    Fast Federal Court and I.T.C. updates
  116. Non Functional Testing anyone by Big+Hairy+Ian · · Score: 1

    Whilst there's lots of good advice about scaling out hardware & infrastructure on here, no one has mentioned Performance Testing, which I would have to say is probably going to be vital for a web app. Also don't forget to security test it properly too or you'll do a LinkedIn on us.

    --

    Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.

  117. killer app and competition by Xylene2301 · · Score: 1

    Remember also that once you release your killer app and get even a handful of clients, someone out there is going to copy what you've done and one-up you in some way. Then you're going to find out how competitive the software industry really is. You might consider registering a provisional patent first. I think that gives you a year to file an actual application. It's cheap insurance.

  118. Couchdb and chicagoboss by thanosv · · Score: 1

    I have two sites, one using Couchdb and the other chicagoboss+mongodb; both scale well for over 500K users so far. The best part is that they both run very cheaply: chicagoboss on a bunch of m1.small spot instances and mongodb on a reserved m1.large, while couchdb runs on a group of spot instances and reserved m1.smalls. Erlang based web apps really scale well. http://www.ostinelli.net/a-comparison-between-misultin-mochiweb-cowboy-nodejs-and-tornadoweb/