Accessing One's Own Metadata
skegg writes: Frustrated journalist Ben Grubb has documented his attempts at gaining access to his own metadata from his carrier. "After more than a year of phone calls and emails and a private mediation session, it still hasn't released the information or answered my one key question satisfactorily: the government can access my Telstra metadata, so why can't I?" Later, he says, "Telstra's one and only valid argument to date has been that identifying who calls me would be in breach of that person's privacy if they called from an unlisted number. I've agreed and said that in providing me with my metadata they should remove unlisted numbers. They argue this would be too difficult to do, which I think is baloney."
Don't you realize they'd have to re-lubricate the DB2 indexes with heavier oil to fulfill your request? Do you have any idea how hard this is? I just love it when normal people think data like this can be magically retrieved.
But there is no way they "can't figure it out"
excitingthingstodo.blogspot.com
If the government already has your meta data, request the government to provide you a copy. At no time should a government have any information about you that you cannot fully review.
If only we could fall into a woman's arms without falling into her hands
at least in the US. Ask your medical provider for a full copy of your medical records. Most will balk, some will flat out deny your request citing HIPAA (even though it's your data) and other "misguided" reasons for not complying. One place even wanted to charge me. Capitalism in medicine is bad enough...
Yeah, "should".
I have to conclude from the supposed difficulty that they store the metadata without noting which numbers are unlisted. Or more correctly, were unlisted at the time, since that status may change.
They argue this would be too difficult to do, which I think is baloney.
I think what they probably mean is, it'd be difficult for them to be able to provide this kind of metadata without risking legal/PR trouble. To make sure that they could provide your metadata without revealing information that could possibly open themselves to criminal prosecution or civil suits would require that they pay lawyers to review the whole process. And then they'd need to spend a lot of time internally figuring out whether they want to spin the whole thing for PR purposes, or if seeing your metadata is too scary to be released at all without a PR nightmare.
And that's a bunch of work to satisfy one reporter. Doing that opens the floodgates for them to have everyone request it. So now, they have to review their entire data collection policy and create policies for who can get access to what. That's a lot of work.
I'm not saying they're right to provide access to customer data to the government while denying customers access to their own data. I'm just suggesting that they're probably not lying when they say it's difficult. You just have to know what they mean by "difficult".
I work for an aerospace company managing our telecommunications. Our accounts with Telstra were used as an attack vector against us: they socially engineered an operator in order to get passwords reset as part of a more sophisticated attack (thanks for phasing out client digital certificates by the way, good going there). Despite having senior regional managers grovel in apology (or whatever they call themselves these days, hierarchies and titles seem to change every 6 months at Telstra), we were told that their systems simply do not possess high-fidelity logging capabilities after the fact to provide enough meaningful information to the authorities, that is to say, if they do not enable it specifically beforehand. Not that the authorities care; investigations into instances of 'cybercrime' in Australia are a joke. This particular instance in no way surprises me. This is also the same company that will send over a squadron of trenching vehicles ready to lay 4km of copper when all we needed was a simple phone line attached at an exchange, given the cable was already laid a year beforehand.
Telstra: You're all the same to us whether you have a $50 a month mobile contract or a million plus dollar annual spend.
* Anonymous because I still have to work with these people every day...
The reason, and I think they should just flat out say it because I think it's valid:
If they allow this guy to get it, then hundreds of thousands of other people will request it as well. They will need to build departments, processes, training, security procedures and create for themselves a very expensive endless quagmire of bureaucracy. Even if he offers to pay for it, someone will eventually sue, somewhere in the world, and get it legally defined as a "Right" so then no-one will have to pay. It's Pandora's box, they know it, he knows it, and they are certainly not going to hand him the key.
Corporations are their own worst enemies at times. Just explain this and explain "We don't want to give it to the government either!! But they're making us!" If they're ordered by a court to release the information, then the court has to deal with most of the legal pitfalls. If the wrong information gets into the wrong hands, that's the court's fault. There's no way they are going to volunteer this.
They have the data, but there's a spider the size of a pig blocking access to the drive.
"Telstra's one and only valid argument to date has been that identifying who calls me would be in breach of that person's privacy if they called from an unlisted number.
Are anonymous phone calls really protected by law?
I mean is there a law that specifically protects the anonymity of people calling from unlisted numbers?
After all, the person holding the unlisted number placed the call.
Do people coming into your house from the street have a legal expectation of anonymity? Does someone getting into your car have a legal expectation of anonymity?
Why would someone calling your phone have a legal expectation of anonymity?
I suspect it has more to do with corporations that robo-call wanting to hide. It's profitable for the phone companies.
When you become a senior citizen, you will begin to receive endless solicitations for medical alert bracelets, "free product" scams, health insurance and so on. I suppose everyone gets some version of this crap. None of these are allowed under the "Do Not Call" act, but the callers always have unlisted numbers and do not reveal their companies' actual names in the calls.
I've wanted my data ever since I first heard about the Data Retention Directive (no longer in force since earlier this year, GOOD).
Mind you, they don't keep only the metadata for your calls but also a lot of "control plane"/out-of-band mobile-network communication. Apart from this being extremely interesting for law enforcement, it's interesting for me too! That is the location part of the data.
Yes, I know I could keep a diary or keep a GPS logger with me, but that needs a lot of extra effort, even for the most automated solutions (charging, downloading, etc.; mind you, this was well before smartphones, probably today you could do this much more easily, especially if you are plugging your phone into a charger each time you step into a room...).
Anyway, the point is that I never got the data, even though I would be willing to pay for it every 6-24 months (that's the retention interval that was in the law).
Asking to see your metadata because there is a law that provides you a circumstance to do so.
get a life. Actually, I don't know how he got to this point. Any self respecting CSR would cheerfully tell you - Sure we'll give it to you just get a warrant. Click.
The unlisted aspect only comes through the SS7 (PSTN) or SIP (VoIP/IMS) protocol headers with a flag indicating whether the account is private, in addition to phone number paying for call, phone number to display, phone number originating, etc... -- AND -- this metadata can change during a call if it was rerouted mid-stream, delayed headers, etc. This gets even more complicated for reverse-billed numbers (800) where the originating number is XXX, the billing number is YYY, the display number is ZZZ, and sometimes an interlink number ends up in there. (And as we found out last month with our call logs, some numbers have yet another header that contains virtualized/multi-ring information which needs to be taken into account, lest the "wrong" number be displayed.)
Now, legally, we are required to keep the originating number, time stamp, and length of call;
And for billing and interconnect agreements, the billing number as well.
As we internally always have full access to the raw protocol data on the Engineering side, the legal siphon (done at the switch level) just skims off all the legally required data and stores it in long-term storage (not a DB), to handle the GBs a day of minimally required data.
We then have a separate process which takes each session and generates a [display-phone number, timestamp] DB for 90 days of call logs for users to look up (or a legal requirement on bills for chargeable calls made, depending on jurisdiction).
Under no circumstances have we ever kept the "is unlisted" status of the call; as it's never been a datum required for any business logic, ever.
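To make the consequence concrete, here is a rough sketch of that skim in Python. The class and field names are hypothetical (invented for illustration, not any real switch or vendor schema); the only thing taken from the description above is which data survives: originating number, billing number, timestamp and length in long-term storage, plus a 90-day [display number, timestamp] log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RawSession:
    """What the switch sees per call (hypothetical field names)."""
    originating: str
    billing: str
    display: str
    presentation_restricted: bool  # the "is unlisted/private" flag from the headers
    start: datetime
    duration_s: int


@dataclass(frozen=True)
class RetainedRecord:
    """What goes to long-term storage: only the legally required fields."""
    originating: str
    billing: str
    start: datetime
    duration_s: int


def skim(session: RawSession) -> RetainedRecord:
    # The privacy flag and display number are simply not copied across.
    return RetainedRecord(session.originating, session.billing,
                          session.start, session.duration_s)


def display_log(sessions, now, window_days=90):
    """Build the short-lived [display number, timestamp] lookup log."""
    cutoff = now - timedelta(days=window_days)
    return [(s.display, s.start) for s in sessions if s.start >= cutoff]
```

Once only `RetainedRecord` exists, there is nothing left to filter on: a later "remove the unlisted numbers" request would need some other source for the status as it was at call time.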
And when handling millions of calls daily, relying on switches to read/dump data for secondary systems to process in real time is a space- and time-sensitive process; thus, only the absolute minimum required is kept to prevent buffer overruns in the data processing phase.
But, as the process is semi-manual to retrieve data for a given time-range I can understand their request to honor "all my metadata" as well.
Limited time-ranges as required by law enforcement are easier to obtain:
- fetch the raw hourly dump files for the time range requested
- run the script that goes through the files and formats a CSV output for any matches of the search phone number
- this process takes hours to run for a week's worth of data as it churns through TBs of text files if it's outside the 90-day "fresh" window that is stored in a more processed state (but not kept, as it's a lot of data to store for no company benefit); most requests from law enforcement only ask for the last 30 days of calls, and this particular process is more streamlined.
- it would be entirely unrealistic to do for the lifetime of a given customer.
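For anyone wondering what "run the script" amounts to, here is a minimal sketch of that kind of scan in Python. The dump layout is an assumption made purely for illustration (one pipe-delimited line per call: timestamp, originating number, billing number, duration in seconds), and the function names are hypothetical; the real files and tooling are not described in that detail above.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator

# Assumed (hypothetical) dump layout: timestamp|originating|billing|duration_s
FIELDS = ("timestamp", "originating", "billing", "duration_s")


def matching_rows(dump_files: Iterable[Path], number: str) -> Iterator[dict]:
    """Stream every record in the given dump files that involves `number`."""
    for path in dump_files:
        with open(path, newline="") as fh:
            for line in fh:
                parts = line.rstrip("\r\n").split("|")
                if len(parts) != len(FIELDS):
                    continue  # skip malformed or truncated lines
                row = dict(zip(FIELDS, parts))
                if number in (row["originating"], row["billing"]):
                    yield row


def export_matches(dump_files: Iterable[Path], number: str, out_path: Path) -> int:
    """Write all matches to a CSV file and return how many were found."""
    count = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS)
        writer.writeheader()
        for row in matching_rows(dump_files, number):
            writer.writerow(row)
            count += 1
    return count
```

It is a plain linear pass over every line of every file in the range, which is why the cost grows with the size of the time range rather than with the number of matches.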
One point to take away from this is that many telecom companies have no interest in keeping your data. It's expensive: each item of data adds substantially more cost, overhead, and resources to manage its storage. It also adds significantly more liability, as now more people have access to it internally, and safeguards and resources must be used to manage it. Which is why the legal information is captured automatically at the switching level, dumped in a non-processed state, stored, and intentionally kept difficult to access. Because we do not want the liability that comes with storing it, or making it easily available to even a subset of internal employees. Each person that has access adds more risk.
Storing users' metadata, at least in the telecom world, is not wanted in the slightest, and we only do the absolute minimum to meet government regulations. Sadly, this also implies that with the current state of laws, the data is not easily accessible, nor is the data in a state that can be released to a private individual without substantial legal risk.
Oh, the person with the unlisted number has called me. If they did it purposefully, I see no reason they have any standing to hide behind an unlisted number. My privacy is as valuable as theirs. If they've pocket-dialed, tough luck. I'm still at the receiving end of the call.
Moreover, unlisted numbers aren't 128-bit hashes that no one has time to enumerate. It's not as if I can't call an unlisted number. Heck, it's easy to corral the unlisted numbers, since they are disjoint from the listed numbers. Start with a set that spans the range of valid 7-digit phone numbers in a given area code. Then remove the listed numbers. Then remove the numbers that get connection errors. You are left with unlisted numbers. Such scans, in the day and age of VoIP, are rather easy to do.
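The set arithmetic in that argument fits in a few lines of Python. This is only a sketch of the subtraction: `listed` and `unreachable` are assumed inputs (a directory listing and the result of some probe, neither of which is implemented here), and the function name is made up. Note the remainder is strictly a set of candidates, since it would also contain any assigned-but-idle or special-purpose numbers that do not return an error.

```python
def candidate_unlisted(area_code, listed, unreachable, digits=7):
    """Numbers in the area code that are neither listed nor erroring out."""
    # Every possible subscriber number, zero-padded to `digits` digits.
    all_numbers = {f"{area_code}{n:0{digits}d}" for n in range(10 ** digits)}
    return all_numbers - set(listed) - set(unreachable)
```

With the default of 7 digits that is ten million strings per area code, which is small enough to hold in memory on an ordinary machine.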
And, finally, many digital connection providers pass an unlisted number along with merely a bit indicating that the number is not to be presented at the terminal. So the information is there, and it doesn't take but an Asterisk setup to leverage that.
So yeah, the telco is just stalling here, nothing new... :(
A successful API design takes a mixture of software design and pedagogy.
Tell-all telephone
Green party politician Malte Spitz sued to have German telecoms giant Deutsche Telekom hand over six months of his phone data that he then made available to ZEIT ONLINE. We combined this geolocation data with information relating to his life as a politician, such as Twitter feeds, blog entries and websites, all of which is freely available on the internet.
By pushing the play button, you will set off on a trip through Malte Spitz's life. The speed controller allows you to adjust how fast you travel, the pause button will let you stop at interesting points. In addition, a calendar at the bottom shows when he was in a particular location and can be used to jump to a specific time period. Each column corresponds to one day.
There is an easy solution for this problem. Corporations could simply not store metadata for individuals. Then they wouldn't have to produce anything. They wouldn't need "to build departments, processes, training, security procedures and create for themselves a very expensive endless quagmire of bureaucracy."
If they want to keep that data, then they need to share it with the people creating such data. The other option would be to share it with everyone. Nobody would like that though. Or, when you login online to check your account, they share it there. That shouldn't be too hard.
Ninjas don't carry tic tacs
On T-Mobile, it is as simple as logging into your account on the web site, and looking at the reports. For a family plan, it lists the sender and receiver phone number of EVERY call AND text messages for everyone on the plan. These are accompanied with their time stamps, too, of course. There is also an option to download a PDF file with the "detailed" report on your bill, which contains all this information.
No idea why other carriers are claiming it is hard to deal with this sort of data.
We need to evolve to adapt to this new threat to the species, and instead of seriously *resisting* its effects on our being, we - the true power - direct the feature to our favour. If, out of the NSA catastrophe, we gain a "New Internet" wherein *everything, everywhere* for 15 years, was available to everyone, then we'd have indeed a new era in the human species. A truly evolutionary step, made by mistake - perhaps.
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
CLI is what you want and you'll see the ID of every incoming call. That's your metadata. There you go, collect your own metadata you lazy bastard.
...Under no circumstances have we ever kept the "is unlisted" status of the call; as it's never been a datum required for any business logic, ever.
And when handling millions of calls daily, and relying on switches to read/dump data for secondary systems to process RT is a space and time sensitive process; and thus, only the absolute minimum required is kept to prevent buffer overruns in the data processing phase;
But, as the process is semi-manual to retrieve data for a given time-range I can understand their request to honor "all my metadata" as well.
Limited time-ranges as required by law enforcement are easier to obtain:
- fetch the raw hourly dump files for the time range requested
- run the script that goes through the files and formats a CSV output for any matches of the search phone number
- this process takes hours to run for a week's worth of data as it churns through TBs of text files if it's outside the 90-day "fresh" window that is stored in a more processed state (but not kept, as it's a lot of data to store for no company benefit); most requests from law enforcement only ask for the last 30 days of calls, and this particular process is more streamlined.
- it would be entirely unrealistic to do for the lifetime of a given customer.
One point to take away from this is that many telecom companies have no interest in keeping your data. It's expensive: each item of data adds substantially more cost, overhead, and resources to manage its storage. It also adds significantly more liability, as now more people have access to it internally, and safeguards and resources must be used to manage it. ....
I'd actually like to build the DB that stores both streams of data efficiently (engineering & skimmed switch) ...
With a good DB design, I'm about 99% certain you could reduce storage requirements including index sizes *significantly*, while improving access to the data for billing/queries (and other stuff) with very modest CPU overhead. It would also probably give you access to WAY more than 90 days, of course then we get into how the data could be misused. Maybe I should look to telecom for my next DB job...
As you describe it, it would be fun to define and build.
Which results in very big collections of facebook data sent to you.
There are only 12 states in the union in which that privacy argument carries legal weight. Hawaii is a middle-ground state, but a nuanced analysis would likely result in the privacy argument being overcome there, too. (The information is not protected unless it was recorded by a hidden means.) I think you need to lawyer up.
Maybe it's a good thing. It raises at least the -possibility- that it might be hard for other people to get his data, as well.
This will then tell them that they want to get out of the data storing game.
If a company stores data about a person, then the company needs to be able to give that person access.
The only way to avoid is... don't store the data.