Did You Do the Long Form?
mliu sent in: "An interesting article about how with modern methods it could be theoretically possible to link census data back to a person and the steps the Census Bureau is taking to prevent this." The marketers know so much now that even the general data the Census Bureau releases could possibly be linked up with Credit Bureau data... ouch.
Couldn't the FBI conceivably get much much more information about you than could be revealed on a Census form, through, say, Carnivore, illegal wiretapping, and other agencies through Echelon?
The Short Form asks only about number of people living in the house, their names, ages, relationship to head of house, and for some bizarre reason, if they are Hispanic or not. Not to be too politically incorrect, but when I was a Census taker in 1980, minorities constituted the overwhelming bulk of my 'mop-up' efforts, and like as not they would not reveal a thing to me when I asked them those simple questions. Some kicked me off of their property, refusing even the most basic questions ('do you live here?'). I think they trust their government less than white folk, at least in this instance.
But now, in part thanks to the internet, what the US Census can collect on an Individual is much less than what a corp can get simply by asking.
The Long Form is filled with innocuous questions like how long it takes to get to work and if you can speak a different language. Even though they ask how much you make and how much your property is worth, that's not a whole lot different than the questionnaires you routinely get from, say, Yahoo! or Amazon.
What the government Does do better than the corps is survey Each and Every household in the country, creating a valuable aggregated dataset that shows demographics and such. But they publish this information and make it available as a govt service.
So I really don't see any need to panic here. In fact, the article is not about what it says it's about. If you read the whole article you discover that the intro is just a hook: it's not about them giving your data to a corp, it's about them blurring individual stats to Preserve the integrity of individuals.
The govt is very concerned that citizens will not perform their constitutional duty to be enumerated. They are scared enough to blur their own stats. Isn't this good news for the paranoid?
Hell, the paranoid didn't fill out their form anyway...
But we are talking about the long form here.
SDMI: Finally! Music that won't rip or burn! Brought to you by the fine folks at RIAA.
Only 10 percent? Well, considering that the US has about 275 million citizens, I can sleep easier at night knowing that only 27 million of these can be re-identified.
Identifies well with others. The Linux Pimp
--It's Pimptastic!--
I guess you don't complain about badly planned roads or government services then. After all, it takes a lot of information to get these things right - without the census how is anyone going to know how to find the best location for new high schools or the best transport option for a town?
I admit having a fine for not filling in this info is ridiculous, but is avoiding the risk of someone sending you junk mail worth the extra cost of bad government planning? You may also think some of the questions are too nosey and perhaps they are for you. However, some areas may appreciate having translated materials at the local government offices in the native languages of local populations.
I can see the justification for the paranoia of some US citizens against their government. However, censuses (censii?) do have a worthwhile purpose and you may be disadvantaging your community by not participating.
This is a bit of a distortion. The phrase "actual Enumeration" is part of the sentence:
IOW, the "actual Enumeration" refers to the process of taking the Census, and does not mean that only a direct counting of individuals is allowable. IMO the recent Supreme Court ruling to the contrary was exactly the same kind of political decision that resulted in us getting our current president. That said, the Constitution says that the Census must be performed in order to apportion seats in the House of Representatives, but it does not exclude its use for other purposes. In fact, it says that it is to be carried out as Congress directs, so Congress clearly has the right to design the Census as it sees fit.
There's no point in questioning authority if you aren't going to listen to the answers.
Theoretically, the civil fine for not answering questions is $100 per question refused. Criminal refusal can be punished by a fine up to $500 per question, although the Census Bureau only has sought criminal convictions twice.
One of the convictions is U.S. v. Rickenbacker, from 1960 or 1961; it was appealed to the Second Circuit Court of Appeals, and the conviction was upheld.
Folks forget that it is constitutional for Congress to pass laws (13 USC 1 et seq.) which regulate the Census, just as an "actual enumeration" is required; the courts consistently uphold this.
Tell Congressfolks like the libertarian Ron Paul about your concerns.
--
Gleepy the Hen. More intelligent than the average hen.
Oh, my. I guess I would find your argument way more convincing if there were even a shred of evidence that what you say is true. Seriously, I see the same old tripe parroted around by people who should really know better than this. So, as my daily dose of public service, let me point you towards an on-line article by Anderson and Fienberg that gives a solid introduction to the issues involved. To cut to the chase here, it is incredibly difficult to find a trained statistician who believes that the naive counting approach to the census could not be substantially helped by incorporating some form of sampling procedure. The problem, of course, is that it is incredibly difficult to find any *two* trained statisticians who agree with each other on what the best procedure would be, which gave politicians an easy way out ("See? Even the experts disagree...").
To give a little bit more away, the primary use of sampling being contemplated was actually targeted at non-respondents. The problem here is that while you could follow up on most non-respondents very easily, there are a lot of them, so it takes time. And, the more time that passes before you follow-up, the worse the data get, and what you really end up with is a systematic undercount. Basically everybody on all sides of the political debate understands and agrees about this. What you should do about it is, of course, where the real fighting starts.
Babar
This was originally modded as funny, but this kind of thing does happen. There is a serious problem with paper records getting mixed up, and other snafus, and that is documented with something just as mundane as credit records.
Now you include this with the idea of medical records, and it can get very messy very quickly. I do not know if it really happened, but it is completely believable.
In some ways, usenet is worse. There was a story a year or two ago about some exec at a major dot com who erased everything he wrote in the archives of the WELL, on the basis that it might be used inappropriately against him, stuff he said when he was a freshman in college and stupid, etc.
Usenet does not have such an erase option, not that I know of. And neither do these databases. You do not have any legal recourse I know of to fix false or messed up data in the medical records. This is very real.
"It is a greater offense to steal men's labor, than their clothes"
So, if I give you my zip code, day of birth, and age, there is a good chance that you can identify me uniquely based on publicly accessible records. Any additional information, like income level, first name, hobbies, years of residence in this zip code, marital status, gender, etc. makes such identifications very reliable.
There isn't a lot of anonymity, either off-line or on-line.
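To make the point above concrete, here is a rough Python sketch of how you would measure it: count how many records become unique once you combine a few "harmless" fields. The records are made up for illustration, and the function name `unique_fraction` and the field names are my own assumptions, not anything from the article.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Fraction of records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Invented toy data: nobody here is a real person.
records = [
    {"zip": "02139", "birthday": "03-14", "age": 34, "gender": "F"},
    {"zip": "02139", "birthday": "03-14", "age": 34, "gender": "M"},
    {"zip": "02139", "birthday": "07-01", "age": 34, "gender": "F"},
    {"zip": "94110", "birthday": "03-14", "age": 51, "gender": "M"},
    {"zip": "94110", "birthday": "11-23", "age": 51, "gender": "M"},
]

for keys in (["zip"], ["zip", "birthday", "age"], ["zip", "birthday", "age", "gender"]):
    print(keys, unique_fraction(records, keys))
```

On this toy set the zip code alone singles out nobody, while each added field pushes more records into a group of one. Real populations are far larger, but the counting is the same.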
I am in awe.
For moderators who don't get the joke, back in WW2, the Census provided information on the location of citizens of Japanese descent, to be imprisoned for the crime of having inscrutably slanted eyes.
I don't like fish. Reverse the fish to e-mail.
All the other information is used by the government so that they can make more informed decisions when they decide what government services to provide to who and how much to spend. Well, at least that's the theory, but I don't think letting them make those decisions without the information would improve things any.
I see even classic Slashdot is now pretty much unusable on dial up anymore.
You betcha!
The funny thing is that I never got a targeted ad that I cared enough about to respond to (except the "I hunt gay pedophiles, give me money" one, but that response was to the FBI). So, my question is this: why do these people continue? Are they finding some secret population of rich stupid people (or poor, even stupider people)?
Ok, targeted ads aside, why should I worry? Well, here's my list:
- Da Gub'mint decides that <insert sub-culture group here> are evil and we must have a "war on <insert subculture group here>". I belong to a few subcultures, so this worries me (when they come for the people who write crypto in Perl, I'll be the first against the wall).
- I really don't want my application for a home loan getting turned down because I happen to be a Linux user, and they default on loans .0002 percent more often than the baseline.
- I worry about just how much of my life will be on that piece of paper the guy across the table is holding in a job interview.
This all seems remote and unlikely now, but so did reverse-engineering the Census data 10 years ago....
I trade my grocery cards every once in a while. Come to think of it I started by borrowing a card from someone else, and traded that, so they don't have my name, and my data is mixed with someone else's.
I also try to buy most of my stuff at the store that has advertised that they don't take those cards. They poked fun at the silliness of the whole thing. (Their major competitor in town switched cards a couple years back so everyone had to get new cards)
... it is incredibly difficult to find a trained statistician who believes that the naive counting approach to the census could not be substantially helped by incorporating some form of sampling procedure. The problem, of course, is that it is incredibly difficult to find any *two* trained statisticians who agree with each other on what the best procedure would be ...
I must disagree with you. The issue is not that sampling COULD be more accurate. The issue is that corrupt politicians could more easily distort the results to their own advantage.
The test of any law is not how it would work if it were administered by honest people, but how much havoc it could cause if administered by DIShonest tyrants.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
That's probably why most (all the one's I've seen at least) credit card applications require your social security number to be put on the app in the first place.
There's 10 types of people in this world, those who understand binary and those who don't.
/. If the government wants us to respect the law, it should set a better example.
... [census bureau administration told their workers that] this is the LAST census.
Wishful thinking by the Clinton regime.
They were trying to set a precedent that the census could be done by statistical sampling. If they had succeeded, they could then set up a "sampling" organization and make up any numbers they wanted.
If you think gerrymandering distorts the electoral process, imagine what would happen if the party in power could assign any population numbers they wanted (within reason) to each state, thus changing the state's number of representatives and electoral votes.
Fortunately the courts slapped 'em down.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
A similar thing happened in Britain, where a company has published the "UK-Info Disk". They basically take all the divergent and distributed information from electoral rolls, land registry, tax registration, private marketing databases, and phone books, then combine and link it all together on Ordnance Survey maps, so you can find out huge amounts of information from a simple postcode (zip).
There's also cracks of the program that let you back trace the database and do any number of reverse lookups (criminals find this especially useful). It seems the developers purposely put these weaknesses into the product as hidden features.
Because the data is legally obtained in the UK and then sent to the Cayman Islands for processing, it basically circumvents all the British Data Protection and Freedom of Information Laws.
The Info Disk company have spun off 192.com which offers similar services, ironically they advertise the "The Big Breach" book from a former MI6 officer on their front-page, a book which is somewhat forbidden in the UK... however 192.com are hardly champions of free-speech when you delve into the infringing and questionable practices they use.
For all the paranoia that's being spouted here, it's nice to hear that the census is taking steps to deal with the issue, which is actually the main thrust of the article. The Census Bureau itself is the one that stepped up and admitted that this is an issue, not some third party that's threatening to break everyone's privacy. They're deliberately fuzzing the data to remove individually identifying data, and even taking steps that may decrease its usefulness (like letting it out in bigger aggregate blocks) in order to make it even harder. Perhaps the best quote was at the end of the article:
That does not sound like a government agency that's eager to breach people's privacy.
There's no point in questioning authority if you aren't going to listen to the answers.
The whole point is it's COERCED!!!
I frequently tell companies what I want. I often take the initiative. But when I don't fill out their forms I'm not fined $100 per question!
---
>80 column hard wrapped e-mail is not a sign of intelligent
>life
...was their keen interest in what time I go to work, how many hours per day I work, what days I work, and how long the commute takes.
Damn, I saw "Enemy of the State", and that information looked like the stuff you'd need to make sure that a person would be unavailable for a few hours while they check out your house.
There's no way in hell I'm going to officially register the time periods when my house will go unmonitored.
Dewey, what part of this looks like authorities should be involved?
But it should make you a little queasy. That meta-self that runs your life has been out there since the first person started collecting data. It isn't YOU that walks into a bank and asks for a loan. It isn't your suit that gets you that loan.
John Q. Banker smiles at the physical YOU and then goes and finds out about your meta-self. This person has much more clout in the world than you ever will. This person is your credit rating, your pay stubs, etc. That person means so much more.
And now the census. A huge compilation of data. The pot of gold at the end of the advertising/data mining rainbow. Of COURSE they will find a way to use it. It is just too valuable to the open market. This is an advertiser's dream. Targeted information on a broad scale down to the very last detail.
And so the real question here lies in not whether the motivations are just for doing this. We made this situation by having a free market system. The question is what the census will do to protect that data, or how they will re-work their questions to protect the individual. Otherwise, there will be a huge resistance to ever filling out a census form again.
THAT would be a shame, because the census really does some good for people, as big and lumbering as it is.
There's nothing Intelligent about Intelligent Design.
When the census polls me, I tell them:
"There are people of voting age residing here. There are people who will be of voting age by the next census."
That's all they're constitutionally entitled to know. (Actually, it's even more than they're entitled to know. The first half is sufficient.)
They try to make it SOUND like you have to answer all the rest of the questions or face a fine. You don't.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Those are the ones that you have zero control over. Some #$%^@ accidentally put it into a shared medical database that I have AIDS. Now I can't get insurance. Banks won't give me loans nor credit cards. I have tests proving I don't have AIDS yet I cannot cleanse this false information because there is no way I can even know every medical data warehouse that has the info. It's like it's been posted to USENET. I send cancel messages but the original post still manages to live on all over the place.
The Census Bureau says it's your civic duty to answer these snooping questions. In reality, it's your patriotic duty to refuse to answer. You can strike a blow for privacy, equality, and liberty by declining to answer every question on the Census form except the one required by the Constitution: How many people live in your home?
The U.S. Constitution says the purpose of the Census is to make an "actual enumeration." That is, to take an accurate count of Americans for the purpose of apportioning congressional districts. But the federal government has gone far beyond that mandate. The long version of the Census -- which one in every six households will receive -- contains a whopping 52 questions. That's 51 more than the Constitution requires. Maybe that's why compliance with the Census had plummeted to just 65% by 1990.
Unfortunately, the government has ways of making you talk. Title 13, Chapter 7 of the U.S. code mandates a $100 fine for those who decline to answer Census questions. What kind of government demands, under penalty of law, reams of personal data -- including racial characteristics -- from its citizens? Ours does. That's why it's time for some polite, patriotic civil disobedience. If you care about privacy, genuine equality, and old-fashioned American liberty, the arrival of the Census form is your chance to literally stand up and be counted.
Tell them how many people live in your home, and that's all. Maybe $100 is a small price to pay for making a principled stand for privacy and freedom.
-snellac
Just my $0.02
------------
CitizenC
"There are people of voting age residing here. There are people who will be of voting age by the next census."
Oops. Make that:
"There are [M] people of voting age residing here. There are [N] people who will be of voting age by the next census."
I keep forgetting that angle brackets delimit HTML tags and I hit the "submit" rather than the "preview" button. (I'd do the hack to get the angle brackets in there but I'm not at the computer where I made the note on how to do it.)
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
do_rant(Census)
{
It never ceases to amaze me how so many really smart people on here can overlook the basic point.
People bitch and moan about credit card companies wanting to know this that and the other. We bitch and moan about internet surveys that we fill out to register for a free service or even one we pay for.
We get upset and afraid of the US Census asking us the same questions. Why?
When Visa or the New York Times wants to get all that information, they ask for it. We can decline to give it to them. Sure, we don't get The Times or we don't get our credit card, but we don't HAVE to give them the information.
We are required by law to tell the US Census these things. That's really very terrifying. Sure, the penalty for not filling out this that and the other is 100 bucks. But what's to keep Congress from deciding that the penalty needs to be stiffer? The voters? The public? The concerned citizen?
Oh please.
As much as I'd like to say that these things make a difference they don't. Election participation is in the toilet. Those that do participate vote (for the most part) party line tickets and blindly follow a party because that's what they've always done.
There's an escape. There always is. But the problems we face now are a symptom of letting democracy slip from the hands of the many into the hands of the few. The Census can do this... well... great. Who does it benefit? It benefits the companies who get access to the data.
The US Census: Marketing Department. Your tax dollars at work.
return;
}
This has been another useless post from....
Killfile(TGK)
No trees were killed in the creation of this post. However, many electrons were inconvenienced.
I remember when the census forms arrived at my residence. There was a paragraph stating that it is illegal for me to lie or to decline to report my census information. Is this incorrect? Could I in fact just simply refuse to participate?
- tokengeekgrrl
right....except that they have the resources to match information. If they have your name and address they can match a social security number to it, and they assume that you forgot to enter it. So they send you the information back with all the information.
2600 may be wrong in the article, but for the most part I have found their articles to be true.
To read it for yourself, just go to any Barnes and Noble and pick up a copy for 5 bucks....the current one is blue with a picture of BellSouth on the cover.
Imagine if the census was taken by examining the contents of people's refrigerators. From just a look inside the whole story about them is revealed. Whether a person is single or married. If they have kids and how old they are. How much money they made. Race, age, male, female, how many people are in the house hold--all is revealed by the fridge. The good thing about this is that nobody has to say a word to the bean counters. It's just a knock at the door and the guy says, "Hi, I'm from the Census and I'm here to photograph your refrigerator."
Correlating information used to be hard, back when computers were big and expensive - businesses could still do it, but it had to be financially worthwhile to dedicate time from that 10-MIPS multi-megabuck mainframe which had two megs of memory, 250MB DASD, and a tape drive. That machine now fits in your pocket, and your desktop machine can do amazing queries with free data from the internet and cheap mapping programs - any data that's been collected can pretty much be correlated with anything else, and the only way to prevent that is not to collect it in the first place.
Remember that the laws protecting privacy of census data aren't graven in stone - they apply only until Congress feels like changing them because they've got some political goal or other. And the US military got access to census data in the 40s to use it for arresting Japanese-Americans because of their race - in spite of the 2000 Census bragging about how nobody's violated their privacy in 50 years, which is since 1950, after the war was over....
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
One application of the famous min cut/max flow algorithm in graph theory (wooo!) is to consistently round matrix entries, i.e. (tricky definition) the row sums and column sums of the rounded elements in the matrix are equal to the rounded row and column sums of the unaltered elements of the matrix.
I hear you asking. Who cares?
One application of this sort of matrix rounding is publishing confidential survey data. Rounding can disguise the data, so that it is not traceable to any particular individual.
The reference below describes that the class of matrix rounding problems is equivalent to the flow problem in certain capacitated network flow models. Feasible flows can be found using the 'min cut/max flow' algorithm. The author proves that there is always a feasible solution for a subclass where marginal totals are uniformly confined.
reference:
Management Science, Vol. 12, No. 9, May 1966
Bacharach, Michael. "Matrix Rounding Problems"
Since it is an old reference, I will attempt to convey the setup. The network is constructed as follows:
The matrix to round, A, has individual entries, A(ij) arranged in rows and columns...
Make a set X, consisting of a node, i, for each row in the matrix. Make a set Y consisting of a node, j, for each column in the matrix.
Add an arc (i,j) representing each matrix entry A(ij). (Note X union Y is a bipartite graph)
Add a source node s and attach it to each element in X, and a sink node t and attach it to each element in Y, the resulting arcs (s,i) represent each row sum and arcs (j,t) each column sum. The lower and upper bounds for all arcs should correspond to the lower and upper rounding limits of the original matrix entry, or the rounding limits for each row/column sum as appropriate.
Any feasible (integer) flow from s to t, will correspond to an acceptable (integer) rounding of values. (And a maximum flow will give us a feasible flow.)
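For anyone who wants to play with the construction above, here is a minimal Python sketch of it. Some caveats: the function names, node labels, and the closing t-to-s arc bounded by the rounded grand total are my own choices; the lower bounds are handled with the textbook super-source/super-sink reduction and a plain Edmonds-Karp search, which is not necessarily how the 1966 paper does it; and it assumes a non-empty rectangular matrix whose sums are exact in floating point (use fractions.Fraction otherwise).

```python
import math
from collections import defaultdict, deque

def max_flow(cap, source, sink):
    """Edmonds-Karp on residual capacities cap[u][v]; mutates cap, returns the flow value."""
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        total += push

def consistent_round(matrix):
    """Round every entry, row sum and column sum to an adjacent integer, consistently."""
    cols = len(matrix[0])
    arcs = []  # (tail, head, lower bound, upper bound)
    for i, row in enumerate(matrix):
        arcs.append(("s", ("r", i), math.floor(sum(row)), math.ceil(sum(row))))
        for j, a in enumerate(row):
            arcs.append((("r", i), ("c", j), math.floor(a), math.ceil(a)))
    for j in range(cols):
        col_sum = sum(row[j] for row in matrix)
        arcs.append((("c", j), "t", math.floor(col_sum), math.ceil(col_sum)))
    grand = sum(sum(row) for row in matrix)
    arcs.append(("t", "s", math.floor(grand), math.ceil(grand)))  # closes the circulation

    # Standard reduction: ship the lower bounds via a super source S and super sink T.
    cap = defaultdict(lambda: defaultdict(int))
    excess = defaultdict(int)
    for u, v, lo, hi in arcs:
        cap[u][v] += hi - lo
        excess[v] += lo
        excess[u] -= lo
    need = 0
    for node, e in excess.items():
        if e > 0:
            cap["S"][node] = e
            need += e
        elif e < 0:
            cap[node]["T"] = -e
    if max_flow(cap, "S", "T") != need:
        raise ValueError("no consistent rounding found")
    # Flow on arc (i, j) is upper bound minus the leftover residual capacity.
    return [[math.ceil(a) - cap[("r", i)][("c", j)] for j, a in enumerate(row)]
            for i, row in enumerate(matrix)]

table = [[3.5, 6.75, 7.25], [9.5, 2.25, 0.75], [3.5, 1.25, 6.5]]
print(consistent_round(table))
```

Which of the valid roundings you get depends on the search order, so treat the printed table as one acceptable answer, not the answer.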
OK, I worked with Census data 10+ years ago (I put the US, Australian, UK and other censuses onto CDROMs - product was called SuperMap) so I know a bit about the data and the formats it's available in. And yes, the data I used was the most detailed that they release.
All censuses worldwide only release summary detail for various sizes of geographic areas, and they release detailed info only for larger areas (eg detailed crosstabs are available at county level and state level, whereas at zip code level only very basic info is released).
Generally any number which is less than a lower limit (typically 15) is ALWAYS rounded to 0, and all other numbers are statistically rounded (eg Aus census rounds to multiples of 3, that is 37 is rounded to 36 two-thirds of the time, and to 39 one-third of the time). In addition to this automated pass, the data is hand-massaged by statisticians to obscure anything else that they think may identify individuals.
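A small Python sketch of the automated pass described above, as I understand it: counts under a threshold become 0, and everything else is rounded to a neighbouring multiple of the base, rounding up with probability remainder/base so the expected value stays equal to the true count. The function name and parameters are made up for illustration; the real agencies' procedures will differ in detail.

```python
import random

def random_round(count, base=3, suppress_below=0, rng=random):
    """Unbiased random rounding of a count to a multiple of `base`."""
    if count < suppress_below:
        return 0  # small cells are always suppressed to zero
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    # Round up with probability remainder/base, down otherwise.
    return lower + base if rng.random() < remainder / base else lower

rng = random.Random(0)
draws = [random_round(37, rng=rng) for _ in range(30000)]
print(sorted(set(draws)), sum(draws) / len(draws))
```

With base 3, a true count of 37 comes out as 36 about two-thirds of the time and 39 about one-third of the time, so the long-run average stays near 37 even though no single published cell can be trusted to the unit.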
So some of the examples quoted above (eg "the number of families with one Asian parent, one Jewish parent and 4 children is 1") are pure and simple wrong - census summary data just does not make this information available.
Remember that some of the first "computers" (mechanical tabulators) were actually built for the US census in 1890. The statisticians have known for a long time that computers are used to interpret the data (using the census data off microfiche was never a real option), techniques may change, but the staff spend a lot of time and effort working to prevent releasing data on individuals.
As for all the "I don't fill in the forms" people, a census is one of the primary defined tasks of governments, and not just counting the people. How is a government supposed to plan what facilities are required at what places if they don't know what type of people live where ? Plotting population trends across the years is how you figure out where to build schools, roads, retirement homes, etc.
There are much better ways to identify individuals than using Census data...
T
This is why, hypothetically of course, that one might want to tell a census worker that one is one's own relative, and that the relative who is the actual owner of the house is overseas and that one is just visiting the place to look after it. No one is presently residing there. Poof, no census form required. Not that I'd ever do this, nosiree. I trust my government.
until they can link to slashdot accounts...
Je t'aime Stéphanie
It could be possible to do the same with Censuses (Censi? Censum?), if it's not illegal... Tell the government what you want them to hear... I don't know what this means legally in the USA or in my own Australia, but I think the idea could have merit...
rr
Quidquid latine dictum sit, altum videtur.
The processes they described DO corrupt the data for statistical analysis.
When they switch some of the entries among people on the same block you can't get proper results when looking for correlations.
When they add noise your statistical significance thresholds rise and you can't detect small effects.
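A quick way to see the noise effect is to simulate it. This is only a sketch under assumptions of my own (Gaussian data, independent Gaussian noise added to one variable, made-up parameter values); it is not the Bureau's actual procedure.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=20000, effect=0.3, noise_sd=2.0, seed=0):
    """Return (correlation on clean data, correlation after noise is added to x)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [effect * x + rng.gauss(0, 1) for x in xs]
    noisy_xs = [x + rng.gauss(0, noise_sd) for x in xs]  # the "blurred" release
    return pearson(xs, ys), pearson(noisy_xs, ys)

clean, noisy = simulate()
print(f"clean r = {clean:.3f}, noisy r = {noisy:.3f}")
```

The relationship is still there after blurring, but the measured correlation shrinks toward zero, so you need a bigger sample (or a bigger effect) to tell it apart from chance.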
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
The above should be modded up even higher than it is. As much as I hate government regulation, the credit/medical/rejection database industry needs to be regulated with the kind of scrutiny we only wish the nuclear power industry was regulated with.
A government agency or administrative court needs to have the power to issue "cancel" requests for this kind of thing. Any database vendor found to be carrying the "canceled" information after 90 days should be fined $100k per incident, payable to the person listed, and forced to remove the data within 7 days or face another $100k fine, with an injunction to follow that the data be removed THAT DAY and if found the next day, fined $500k and barred from buying, selling or collecting information until the error is corrected. All databases carrying medical, credit or other information used to deny, limit or otherwise screen access to credit, medical or insurance services should be required to be run against the "cancel" database every year and ALWAYS before being sold AND after being bought before they may be used as a screening tool.
It's too easy for the rejection industry (banks, medical, insurance, et al) to get bad data into their systems and not take it out or somehow keep reintroducing it (usually by swapping data with their fellow purveyors of rejection data). The USENET analogy is perfect.
Did you know that most people can be identified by only their birthday, area code, and 1 or 2 interests? Yet another reason to lie on on-line registration forms.
C:\
C:\Dos
C:\dos\run
C:\dos\get\shot\with\a\shotgun
Sigs are against my religion
So, from other posts, it's been brought to my attention that it's in the Constitution, even, that the government hold a Census. Refusing to cooperate in the Census is actually 'unconstitutional', rather than illegal.
On the other hand, Census information is used to determine the number and nature of Congressional seats and representations. It may also be used to dictate reapportioning of taxes according to some formula. It may also be involved in a handful of other useful things, but any more is wild speculation.
In the letter of the Constitution then, it would seem you need to tell them how many people live in your household, and maybe how many are of voting age. In the spirit of the Constitution, answering the Census is 'supposed' to help make our government more responsive and adapted to your country's needs.
-AS
*Pikachu*
There's an article in the current issue of 2600 on how to get anybody's credit information.
Basically the method is:
1. Get a MasterCard/Visa application, whatever.
2. Enter the target's current address into the "Previous Address" section on the application.
3. Enter your address (or the dropsite address) in the "Current Address" section.
4. Enter their name and birthday in the information section.
5. Leave the rest of the application blank. (You don't want the application accepted, and if it's accepted you'll be in a shitload of trouble.)
6. The agency will match the person with their name and the "previous address" but because no income information is mentioned they'll reject the application.
7. By law, the agency is required to send a notice of rejection to you, which will have the person's social security number on it.
And once you have a social security number, you're set: Go get a driver's license, and you're a new man (or woman, if that's the way you swing :) )
Disclaimer: This is provided for informational purposes only, I do not condone any misuse of such information...yada yada
I sure hope this was NOT the last census.
I enjoy living in a Republic. How can I be represented if I am not counted?
-Peter
"There is no number '1.'"