New York City Has a Y2K-Like Problem, and It Doesn't Want You To Know About It (nytimes.com)
On April 6, something known as the GPS rollover, a cousin to the dreaded Y2K bug, mostly came and went, as businesses and government agencies around the world heeded warnings and made software or hardware updates in advance. But in New York, something went wrong -- and city officials seem to not want anyone to know. [Editor's note: the link may be paywalled; alternative source] New submitter RAYinNYC shares a report: At 7:59 p.m. E.D.T. on Saturday, the New York City Wireless Network, or NYCWiN, went dark, waylaying numerous city tasks and functions, including the collection and transmission of information from some Police Department license plate readers. The shutdown also interrupted the ability of the Department of Transportation to program traffic lights, and prevented agencies such as the sanitation and parks departments from staying connected with far-flung offices and work sites. The culprit was a long-anticipated calendar reset of the centralized Global Positioning System, which connects to devices and computer networks around the world. There has been no public disclosure that NYCWiN, a $500 million network built for the city by Northrop Grumman, was offline and remains so, even as workers are trying to restore it.
City officials tried to play down the shutdown when first asked about it on Monday, speaking of it as if it were a routine maintenance issue. "The city is in the process of upgrading some components of our private wireless network," Stephanie Raphael, a spokeswoman for the Department of Information Technology and Telecommunications, said in an email on Monday. She referred to the glitch as a "brief software installation period." By Tuesday, the agency acknowledged the network shutdown, but said in an emailed statement that "no critical public safety systems are affected." Ms. Raphael admitted that technicians have been unable to get the network back up and running, adding, "We're working overtime to update the network and bring all of it back online." The problem has raised questions about whether the city had taken appropriate measures to prepare the network for the GPS rollover.
NYC Maintenance budget. You would assume they would switch out 5% of everything a year for a 20 year refresh cycle, but No.
well, it's hard to compete with the bright lights and excitement of Sheboygan, but New Yorkers seem to make do.
Maybe it went something like this?
Cops and parking agents can still ticket -- they'll just have to type in a plate # manually to "run" it.
>> whether the city had taken appropriate measures to prepare the network for the GPS rollover
NYC is paying $37M a year for a managed service from Northrop Grumman. (Remember when they used to build airplanes?) Under any reasonable standard, the city did "prepare the network" - they paid NG to make it NG's problem to solve.
The forthcoming lawsuit/clawback should be fun to watch, though.
to judge others by where they live or want to live is idiotic. I get tired of them trying to dictate to me, but all I want them to do is succeed from the nation. Again though, unlike them. Or you. Who am I to tell them where to live or how to live.
Anonymous comments are as pathetic as the anonymous "sources" that contaminate gutless journalism from the New York Time
They pay 45 million a year in support for that network to Northrop Grumman. GPS being the root of that downtime should have been easily fixable. The GPS epoch that ended was the second one since its origin in 1980. It was entirely predictable down to the day and the minute, and they had 20 years to prepare for it. Hell, they even have 20 or more satellites with atomic clocks whose sole purpose for being built is calculating the time.
I don't see how they are supporting their claim of the City trying to keep people from knowing about this. Just because the government isn't jumping up and down declaring "we failed!" doesn't mean they are actively trying to oppress people from reaching that conclusion.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
... should have been easily fixable...
Yes, should. I imagine the reason it isn't simple is because some of the platforms are out of support, others are locked down and the vendor has no interest in fixing them, some require a development tool that only runs on Windows 95, and they lost the source code for the rest. Then there's the ones they probably bricked while updating.
"The problem has raised questions about whether the city had taken appropriate measures to prepare the network for the GPS rollover." I would say it raised answers not questions. The question, did NYC prepare for the GPS rollover, was answered a resounding and emphatic NO, they did not even try to prepare.
There's a bit of confusion here I think... There are tons of IoT devices, which may or may not have GPS problems of their own, but isn't the main issue just the network being down? Once that's back up, what remains of individual firmware rollouts for various types of devices can be addressed.
I'm assuming the NYCWiN has some sort of mesh topology? Maybe with scrambled GPS data it's now trying to reconnect to devices far away from where it actually is and is thus failing to converge any routes? I'm a bit curious about the tech details here. Hope we learn more.
Hire a Linux system administrator, systems engineer,
> waylaying [...] the collection and transmission of information from some Police Department license plate readers
good.
man, I feel like mold.
>while you can't even write complete sentences and spell simple words like 'secede'
OP Anon didn't write that, Pgmrdlm did. Any moron could see that.
I didn't say this was absurd to have downtime. This isn't a planned outage for maintenance either. To be clear this isn't 1% down time. This issue began on Sunday the 6th, and the network has been down for at least nine or ten days. At this point the network has been down for about 3.6 % of the year, and that percentage is increasing.
What I said quite clearly, they have known (northrop grumman) about the epoch changing for 20 years. This shouldn't be a surprise. Had they designed the network properly, they would have been aware of this absolutely unavoidable reality and been able to pre-emptively plan for and fix the underlying causes.
They clearly did not, so the question becomes what exactly do they do for the 40 million dollar contract, if not maintain what they built and marketed as a safe alternative and reliable and viable critical information network.
Sorry, my math was confused. But my larger point stands: ten days of downtime for a critical response network is not a prepared response. If it was possible that this was going to occur, they should have spent the months prior alerting all parties, and this news cycle would have been easily avoidable.
Clearly, The Machine is battling Samaritan again.
Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
I live in NYC, and have my whole life. It mostly sucks. Giuliani fixed a lot of it, but going downhill to the 70s again, fast
-- 73 de KG2V For the Children - RKBA! "You are what you do when it counts" - the Masso
Maybe he just wants the nation to succeed? I mean, an admirable goal I guess?
Northrop Grumman designed and specced out the hardware. If they designed a network with a known death date due to a proprietary GPS, it was a poor design. Even if it was an accident and the mess-up was a human error in design, they still had years to research a replacement part to keep it viable when they went and looked at the hardware in advance of the epoch change (what they get paid to do). This looks to me like they (Northrop Grumman) were caught flat-footed, with a failure to design/build/plan for the future and/or a failure to maintain the operational viability and up-time of a network they were paid to support.
A big problem with GPS is so many 3rd party hw vendors made it proprietary-dirt-simple and "upgrading" to a new signal means gutting the existing logic, sometimes entirely. It's not a trivial upgrade whatsoever.
And yet every cheapie android phone out of China managed it just fine. Since the event's timing has been known down to the second since the GPS system came into existence, most devices didn't even need an update; they left the factory ready for the event. It's not like it's a whole new protocol. It's actually the same protocol, just with a counter rolled past zero.
Giuliani fixed nothing, you know nothing about it, and I doubt you've ever lived there on that basis.
No doubt everything you say is true. But, you stated that this should be easily fixable. I simply theorized that it isn't quite so easy to fix (or they would have done it), because of those poor decisions.
A good question here is: Should vendors be held to a standard of responsibility or is it the RFP writer's responsibility to ask for these things? If we decide to hold Northrop Grumman to a standard of responsibility, what is that standard and how is it enforced? What happens if all of the potential vendors turn out to be just as irresponsible?
Heh. If it's the 2.5GHz spectrum cellular radio network I think it is, the base stations all use GPS-referenced 10MHz source oscillators to feed the radio stacks. If they did not choose the reference clock hardware wisely, then I could see why it's taking so long to get things back: they have to touch every base station.
It seems that if they had been at all pro-active they could have spent those months making the disruption not happen at all. It's not like every device has to wait until the rollover to be patched. They could have fixed this 5 years ago and not even skipped a beat when the rollover happened.
When someone is paying you half a BILLION dollars, you are responsible for vetting the selected hardware against that sort of problem. When you're buying millions in hardware, you get to put things like that in the contract.
Given NG's line of business, none of this should have been new to them.
Northrop Grumman has a $299 unit fix.
When you're buying millions in hardware, you get to put things like that in the contract.
You get to, but you don't necessarily have to.
It will be interesting to find out exactly what was in the contract. That will come out in the lawsuit, if it's not somewhere publicly posted already.
Velociraptor = Distiraptor / Timeraptor
Since NG is also getting $45 million/year to maintain the network, it would be on them to put that in their contracts with any hardware vendors they worked with.
It will be interesting to find out exactly what was in the contract. That will come out in the lawsuit,
You're assuming NG didn't throw in a binding arbitration clause together with their general Disclaimer of Warranty and Force Majeure covering unexpected situations such as a GPS Rollover as part of the client onboarding.
Wow, that's amazing - you solved the problem by simply reading msmash's summary - you are like a technical savant. /sarcasm
The issue isn't the GPS hardware, it's the signal the satellites are sending and the software that is processing those signals.
Ken
The underlying network should be pretty agnostic. Devices connected to the networks probably need timing to a reasonable degree of precision for things to stay in sync, and when your GPS receiver decides to party like it's 1999, you lose that precision.
Having just reviewed several GPS receivers from different brands over the last month or so to ensure that my workplace wouldn't have such problems (most are fine and dandy, one particularly ancient one needs replacing in the coming months - interestingly, that one's from a brand that markets heavily to emergency services) I'm wondering just how decentralized the network is.
If they've got a limited number of timing hosts with GPS receivers and NTP out, and all the devices on the network are synced to that, they need to acquire and swap in a limited number of new ones.
On the other hand, if it's really decentralized, and almost every device on the network has its own GPS receiver... pass the popcorn, wouldja?
Village idiot in some extremely smart villages.
It's the directed government answer to all media inquiries. And let's be honest: getting ambushed on the way into a building is a hit job, not journalism.
The issue isn't the GPS hardware, it's the signal the satellites are sending and the software that is processing those signals.
The satellites send a week number from 0 to 1023, which is a range of slightly less than 20 years. The GPS hardware needs to turn this into a date. You could build GPS hardware with firmware that assumes it will never be used _before_ the hardware was built, so the date must be between (starting date built into firmware) to (same + 1023 weeks).
This is fine if you assume your hardware breaks down within 20 years. If you assume it lasts longer, then the firmware must be updateable, so the start date can be updated every 19 years.
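To make the pivot idea above concrete, here is a minimal sketch, not any vendor's actual firmware. The names `resolve_full_week` and `pivot_week` are made up for illustration; the only facts taken from the thread are the 10-bit week counter (0 to 1023) and the 1980 epoch (January 6, 1980).

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)   # start of GPS week 0 (GPS time scale)
GPS_WEEK_MODULUS = 1024            # the broadcast week number is 10 bits wide


def resolve_full_week(broadcast_week: int, pivot_week: int) -> int:
    """Pick the full week number in [pivot_week, pivot_week + 1023]
    whose low 10 bits match the broadcast value.

    pivot_week is a hypothetical "firmware build week": the receiver
    assumes it is never used before that week.
    """
    if not 0 <= broadcast_week < GPS_WEEK_MODULUS:
        raise ValueError("broadcast week must be in 0..1023")
    return pivot_week + (broadcast_week - pivot_week) % GPS_WEEK_MODULUS


def week_start(full_week: int) -> datetime:
    """Start of the given full GPS week, ignoring leap seconds."""
    return GPS_EPOCH + timedelta(weeks=full_week)


# A receiver built around week 2000 sees the counter roll to 0
# and still lands on week 2048.
recent = resolve_full_week(0, pivot_week=2000)
# A receiver whose pivot was never moved past week 1000 lands on
# week 1024 instead, i.e. it thinks it is 1999 again.
stale = resolve_full_week(0, pivot_week=1000)
```

With a pivot near the build date the window covers the next 1024 weeks; with a stale pivot the same broadcast value resolves to a date almost 20 years in the past.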
Uh, this is the second GPS date rollover since its inception in 1980. The first was in 1999, after 1024 weeks of operation.
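For anyone who wants to check the dates, a quick sketch of the arithmetic, assuming the GPS epoch of January 6, 1980 and ignoring the leap-second offset between GPS time and UTC (`rollover_start` is just an illustrative name):

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)   # week 0 begins here (GPS time scale)
WEEKS_PER_EPOCH = 1024             # 10-bit week counter


def rollover_start(n: int) -> datetime:
    """GPS-time instant at which the week counter wraps for the n-th time."""
    return GPS_EPOCH + timedelta(weeks=WEEKS_PER_EPOCH * n)


first = rollover_start(1)    # August 22, 1999, 00:00 GPS time
second = rollover_start(2)   # April 7, 2019, 00:00 GPS time
third = rollover_start(3)    # the next one
```

GPS time runs a few seconds ahead of UTC, so midnight GPS time on April 7 falls just before midnight UTC on April 6, which lines up with the 7:59 p.m. E.D.T. Saturday outage time in the summary.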
There is no excuse for any device released or updated after 1999 to not account for this GPS glitch.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
Actually the firmware doesn't need to be upgradable, you just need to be able to set and save the current date occasionally to non-volatile memory. The device could do this itself periodically after synchronising itself. Yes, there is a small risk of spoofed GPS signals causing an anomalous date to be saved, but even there the risk can be minimised: more than x seconds since the last recording, and more than y seconds of continuously synchronised signals. For mobile GPS devices, more than z km of travel while switched on as well.
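A rough sketch of that scheme, under loud assumptions: `WeekTracker`, its field names, and the threshold values are all invented for illustration, and the "non-volatile memory" is just an attribute here. It shows only the x and y checks; the z km travel check for mobile devices is left out.

```python
from dataclasses import dataclass

GPS_WEEK_MODULUS = 1024  # 10-bit broadcast week counter


@dataclass
class WeekTracker:
    saved_week: int                      # last full week "written to NVM" (stand-in)
    min_save_interval_s: float = 3600.0  # the "x seconds" threshold (assumed value)
    min_lock_s: float = 600.0            # the "y seconds" threshold (assumed value)

    def resolve(self, broadcast_week: int) -> int:
        """Full week at or after saved_week matching the broadcast value."""
        return self.saved_week + (broadcast_week - self.saved_week) % GPS_WEEK_MODULUS

    def maybe_save(self, broadcast_week: int,
                   since_last_save_s: float, locked_s: float) -> bool:
        """Advance the saved week only when both sanity checks pass."""
        if since_last_save_s <= self.min_save_interval_s:
            return False
        if locked_s <= self.min_lock_s:
            return False
        self.saved_week = self.resolve(broadcast_week)
        return True


tracker = WeekTracker(saved_week=2040)
# Counter has wrapped to 0; long enough since last save and a stable lock.
saved = tracker.maybe_save(0, since_last_save_s=7200, locked_s=900)
# A second attempt ten seconds later is rejected and changes nothing.
rejected = tracker.maybe_save(5, since_last_save_s=10, locked_s=900)
```

Because the saved week only ever moves forward, the 1024-week window slides along with the device and no firmware update is needed at a rollover, as long as the device is powered up at least once every 19 years or so.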
A running system shouldn't fail across a GPS epoch event. Given this is not the first epoch event all of the equipment should have been able to handle this. This can be simulated in QA testing prior to releasing the firmware image.
"The problem has raised questions about whether the city had taken appropriate measures to prepare the network for the GPS rollover."
I'm no rocket scientist but seeing as how they're having massive problems due to the rollover I'd have to say no, they didn't.
Just cruising through this digital world at 33 1/3 rpm...
they have known (northrop grumman) about the epoch changing for 20 years. This shouldn't be a surprise.
It was no surprise. The people that were there way back when knew perfectly well about the problem, and they also knew perfectly well that they wouldn't be around to be blamed for it in 20 years.
It was easy for them at the time to make the decision to "acknowledge the problem" and to quietly pretend that someone else would fix it later.
All those people are long gone, and the people that came after them just kept kicking the can down the road until they ran out of road.
Just cruising through this digital world at 33 1/3 rpm...
Everyone making GPS devices should have known about this, but now the Tom-Tom in my car thinks it's noon once it gets a GPS lock.
According to their site, there's an update available.
I got an email from Garmin and did the required update and no problems.
It will be interesting to find out exactly what was in the contract. That will come out in the lawsuit,
You're assuming NG didn't throw in a binding arbitration clause together with their general Disclaimer of Warranty and Force Majeure covering unexpected situations such as a GPS Rollover as part of the client onboarding.
Force Majeure is in no way applicable to a man-made event that has a 100% probability and a known timeframe down to the second - 20 years in advance.
they have known (northrop grumman) about the epoch changing for 20 years. This shouldn't be a surprise.
It was no surprise. The people that were there way back when knew perfectly well about the problem, and they also knew perfectly well that they wouldn't be around to be blamed for it in 20 years.
It was easy for them at the time to make the decision to "acknowledge the problem" and to quietly pretend that someone else would fix it later.
All those people are long gone, and the people that came after them just kept kicking the can down the road until they ran out of road.
Some of us realized we would still be around when the clock ran out. I watched others deal with DEC's date-75 problem. I fixed the Town of Hudson's Y2K problems in 1998, and later reported a Y2K bug in DEC's software. The next problem will come when the Unix 32-bit seconds counter overflows in 2038. There are people who are concerned with such things, but we don't get much attention until the crisis is upon us and it is too late for any of the easy solutions.
This gives me lots of faith that we will have 0 problems in 2038 with the epoch rollover.
The next problem will come when the Unix 32-bit seconds counter overflows in 2038.
I can't wait to see the general havoc this causes, assuming I'm still alive to enjoy it.
Just cruising through this digital world at 33 1/3 rpm...
And yet every cheapie android phone out of China managed it just fine.
Mobile phones do not critically rely on GPS time for any function. The infrastructure behind them, on the other hand, critically requires precise time sources for data synchronisation.
Your argument is like saying "I survived just fine in the last blackout without a generator, so why would a hospital need one?"
Don't make those arguments. They are anti-intellectual. And shame on the people who modded you up.
Technically, the 32-bit signed integer that holds the Unix seconds value overflows and goes negative! It does not overflow to zero. Which means some systems will jump back in time by about 68 years from 1970, to December 1901.
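A small sketch of the wraparound, emulating a signed 32-bit counter in Python (the helper names `as_int32` and `to_utc` are just for illustration; real systems hit this in C's `time_t`, not in Python):

```python
import struct
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch (possibly negative) to UTC."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


last_good = to_utc(INT32_MAX)        # last second a signed 32-bit counter can hold
wrapped = as_int32(INT32_MAX + 1)    # one tick later the value goes negative
after_wrap = to_utc(wrapped)         # and the clock lands in December 1901
```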
Note that Y2038 failures will start to manifest themselves when attempts are made to set timers past the overflow date. Therefore, failures will start to appear before the actual overflow date.
Any system calls, protocols and file systems that use 32-bit signed absolute time fields will also be impacted. Therefore, 64-bit systems are not immune to Y2038 due to inter-operability support for other systems and non-compliant 32-bit applications.
In particular, the automobile industry is going to be at high risk of hitting Y2038 failures as vehicles built today may still be on the road in 2038.
I note that Linux v5.0 introduced some Y2038 compliant fixes which removed do_gettimeofday(). This promptly broke a 3rd party kernel loadable module that relied on the now non-existent do_gettimeofday(). Therefore the Y2038 havoc has already started...
Also it is not possible to set a smartphone past the year 2036. So currently Y2038 on smartphones has been "fixed" by not allowing the smartphone to get too close to the year 2038!
Actually, why would the communications network even NEED GPS to function at all?