Slashdot Mirror


Passport Database Outage Leaves Thousands Stranded

linuxwrangler (582055) writes Job interviews missed, work and wedding plans disrupted, children unable to fly home with their adoptive parents. All this disruption is due to an outage involving the passport and visa processing database at the U.S. State Department. The problems have been ongoing since July 19 and the best estimate for repair is "soon." The system "crashed shortly after maintenance."

20 of 162 comments

  1. Change management fail by dave562 · · Score: 4, Insightful

    Rollback plan? What is that?

    1. Re:Change management fail by roc97007 · · Score: 5, Informative

      It's the wave of the future. A typical contract with offshore IT is for "current minus one", which means that each new firmware, OS or driver release causes a flurry of "maintenance" by remote "admins" who follow written procedures to update the systems with no real understanding of what they're doing, in what order they should do it, or what to do if something goes wrong. A typical list of systems to update may randomly contain a haphazard collection of prod and development machines, and may include some but not all members of a cluster. Systems are patched in Asset Management order, with no thought to rolling through dev and QA first before doing prod.

      The backout plan is to engage the vendor.

      Our outsourced IT bricks a few servers a year. We try to take it in stride. We've argued hysterically that if they really have to do firmware updates, they should at least do dev servers first, for God's sake. They seem not to understand this.

      So yeah, I could definitely see this happening. We will be seeing more of the same. You get what you pay for.
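
      For what it's worth, the ordering we keep asking for is not complicated. Here is a minimal sketch in Python, assuming a made-up inventory where every host carries an "env" tag of dev, qa or prod. The host names, the field names and the patch/healthy callbacks are all hypothetical, not anything our vendor actually runs. The only point is that prod is never touched until the earlier waves pass a health check.

```python
# Hypothetical rollout order; the "env" tag on each host is an assumption.
ROLLOUT_ORDER = ["dev", "qa", "prod"]


def plan_waves(hosts):
    """Group hosts into patch waves ordered dev -> qa -> prod.

    A host with an unknown "env" raises KeyError instead of being patched blindly.
    """
    waves = {env: [] for env in ROLLOUT_ORDER}
    for host in hosts:
        waves[host["env"]].append(host["name"])
    return [(env, sorted(waves[env])) for env in ROLLOUT_ORDER if waves[env]]


def roll_out(hosts, patch, healthy):
    """Patch one wave at a time; stop before the next wave if any host is unhealthy.

    `patch` and `healthy` are caller-supplied callables (hypothetical here).
    Returns (patched_hosts, failed_hosts).
    """
    done = []
    for _env, names in plan_waves(hosts):
        for name in names:
            patch(name)
            done.append(name)
        failed = [name for name in names if not healthy(name)]
        if failed:
            return done, failed
    return done, []
```

      Feed it a list in Asset Management order and it still does dev first. If a QA box comes back unhealthy, the prod wave never starts.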

      --
      Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    2. Re:Change management fail by Anonymous Coward · · Score: 4, Insightful

      Sounds like your IT has been outsourced to India, who as a culture, literally does not know how to say "no". The answer is always "yes" or some other affirmative that makes you think they have it under control and can do the work, when in fact they don't have a clue how to perform the work they just said "yes" to. So they learn as they go, on your production servers. They don't know what development / test environments are.

    3. Re:Change management fail by roc97007 · · Score: 3, Informative

      Oh man, don't get me started. It's not even clear that one would need to pay more -- we have not saved money so far by outsourcing, although the outsource company keeps telling us that savings are just around the corner. The first year, the excuse was that there are always startup issues; the second year, that the outgoing employees did not document their jobs well enough (probably true -- who would?); the third year, that the scope was bigger than we said it was. And so forth. Each year brings a new excuse, and each year the total cost is more than what we were paying when we had our own IT department.

      So yeah, insourcing, or at least selective insourcing (let them keep doing what they do well, if anything), makes tremendous sense to me.

      But I don't make the decisions.

      And even where upper management has considered terminating our outsourcing contracts, it's only to give the contract to a different outsourcing company, which only means we're now calling a building across the street from the original building in Hyderabad. Who knows, we might even be dealing with some of the same people.

      --
      Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    4. Re:Change management fail by ShanghaiBill · · Score: 4, Informative

      Sounds like your IT has been outsourced to India, who as a culture, literally does not know how to say "no".

      It takes two to fail to communicate. You should not be asking questions that require a direct "yes or no" answer. In many cultures, that is considered rude.

    5. Re:Change management fail by dave562 · · Score: 3, Interesting

      As much as I am not a fan of government regulation, my professional experience has shown me that the only time people get IT anywhere close to right is when there is a risk of financial penalty involved in getting it wrong. Regulation seems to be the only solution to people working for peanuts. The people who work for peanuts make mistakes. If those mistakes cost the company more than the company saves by hiring those people, they will not hire those people.

      Out of all of the industries that I have worked with, the financial services industries seem to be the most together. They are not perfect, but the penalties associated with losing customer data make them more careful.

    6. Re:Change management fail by ShanghaiBill · · Score: 5, Insightful

      Sorry, what part of paying you to do a job requires me to give a shit about whether or not your failed third-world culture doesn't like answering direct fucking questions?

      The part about you paying them far less than you would pay someone culturally compatible. If you want to pay peanuts, you need to deal with the cultural consequences. I have dealt with Indians for years, and have learned how to ask questions so that I get the answer I am looking for. It is not that hard.

    7. Re:Change management fail by Areyoukiddingme · · Score: 4, Funny

      Well there's your problem! God has no part in an IT management plan.

      Yeah, the other guy has it well in hand.

    8. Re:Change management fail by roc97007 · · Score: 3, Funny

      I sometimes think that if I accidentally entered a church with an IT management plan in a back pocket, my pants would burst into flames.

      --
      Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    9. Re:Change management fail by l0n3s0m3phr34k · · Score: 3, Interesting

      This isn't always the case. A company can save money by outsourcing IT infrastructure if it goes with the right vendor: VMware, virtual servers, proper fail-overs, big multi-core blade racks where the VM is still more powerful than your original server and still costs less. Of course, I work at HP in Enterprise Services, so the level I'm talking about probably isn't affordable for a "small company". We have VERY specific steps for everything; our "runbooks" detail everything from server configs, hardware, rack enclosures, and port layouts to the responsible parties to contact for each part if it fails. When you have a rack of blades, it's far easier to snapshot, launch, then test; we always have a "backup" in a hot image ready to go if anything fails. Often I'm working with 5-15 people spread across the globe, all doing different functions (Unix admin, Wintel, recovery, netops, etc.), but we rarely have any "HP owned" customer-impacting outages. Of course my major clients are airlines, so it's all tightly regulated; your individual mileage may vary LOL.

    10. Re:Change management fail by khchung · · Score: 4, Insightful

      Sounds like your IT has been outsourced to India, who as a culture, literally does not know how to say "no".

      On the other hand, I have encountered plenty of managers who literally do not know how to take "no" for an answer.

      Takes two to make a pair.

      --
      Oliver.
    11. Re:Change management fail by ShanghaiBill · · Score: 5, Interesting

      simple yes or no questions

      It is only simple because you speak English. You need to widen your cultural perspectives. In other languages, and other cultures, it is not so simple. For instance, Chinese does not even have the words "yes" and "no". If you ask a Chinese speaker if they have a pen, they will answer "have" or "not have". If you ask them if they are going to lunch, they will answer "going" or "not going". There is no such thing as a "yes or no question" in Chinese, and culturally, Chinese are much more direct than Indians or Japanese.

    12. Re:Change management fail by ruir · · Score: 5, Insightful

      Well, people lying to me or choosing to reply with half-truths to save face also affects my ability to do the job. My culture considers that extremely rude too. The rules of engagement have to change in a multicultural world, and if I am the customer, the obligation to bend their culture somewhat is a ball in their court. Or I may take my business and wallet elsewhere.

    13. Re:Change management fail by Jesrad · · Score: 3, Insightful

      Sounds like your IT has been outsourced to India

      Not necessarily. I've seen this exact kind of madness happen just as easily with locals, here in France. Like that time the local, on-site support team from our vendor rebooted the production server instead of the test platform, because whoops, wrong terminal window in the foreground.

      Or when they covertly rolled out a "shame-bug fix" remotely on the production platform during a weekend night, again instead of targeting the test platform, then noticed their mistake, and wiped out months of production data by reverting to a long-expired backup.

      Or when the local datacenter people managed to botch our fully automated install+deploy+configure solution by messing up the one thing they had to do right - that is, upload it and launch it on the correct machine of the cluster.

      Don't think hiring local people for more money protects you from such cringe-worthy nonsense. The moment you outsource anything, and I do mean *anything*, no matter how far and how expensive and what nationality: if you base your expectations on anything but an actual track record of reliability and dependability, you're exposing yourself to long hours of hair-pulling and yelling into phones.

      --
      Maybe we deserve this world ?
  2. The solution by Loopy · · Score: 4, Funny

    Sic the healthcare.gov guys on it. I'm sure it'll be right as rain in no time.

  3. Replication != Backup by Anonymous Coward · · Score: 5, Interesting

    From their Q&A:

    Q: Why wasn’t there a back-up server?
    Back-up capability and redundancy are built into the system. The upgrade affected our current processing capability, in part because it interfered with the smooth interoperability of redundant nodes.

    We don't need backups, the data is replicated, we're cool.

    1. Re:Replication != Backup by sumdumass · · Score: 4, Funny

      http://xkcd.com/327/

      I wonder if Little Bobby Tables applied for a passport while this was going on?

  4. Ask the NSA by Daemonik · · Score: 4, Funny

    I'm sure they have full copies of all the data already.

  5. Large Databases? by TechyImmigrant · · Score: 4, Insightful

    The article tries to wow us with the hugeness of the database, like this is a reason for the issues.

    Yet the numbers quoted are not that big. Any modern PC isn't going to get too upset handling 75 million things. A real data center is going to sit there wondering what to do with the remaining 500TB of storage.

    I don't doubt that there is some horrible flaw in the way the system was conceived that rendered it fragile, but whatever it is, it's nothing to do with the enormity of the problem, because it isn't very enormous.
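
    To put a rough number on it, here is a back-of-the-envelope sketch in Python. The 75 million figure is the one quoted above; the 10 KiB per record is purely my assumption for illustration, not something from the article, so plug in your own guess.

```python
RECORDS = 75_000_000                   # count quoted in the comment above
ASSUMED_BYTES_PER_RECORD = 10 * 1024   # assumption: 10 KiB per record, not from the article


def total_gib(records, bytes_per_record):
    """Return the raw data size in GiB for a given record count and record size."""
    return records * bytes_per_record / 1024**3


if __name__ == "__main__":
    size = total_gib(RECORDS, ASSUMED_BYTES_PER_RECORD)
    print(f"{size:.0f} GiB")  # stays under 1 TiB with the assumed record size
```

    Even if the real records are ten times fatter than that guess, the total is still in the single-digit terabytes.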

    --
    I should use this sig to advertise my book ISBN-13 : 978-1501515132.
  6. Re:Here's the problem... by AJWM · · Score: 4, Funny

    Or worse, they're running SQL Server on Sun boxes...

    --
    -- Alastair