Tracking the Congressional Attention Span
Turismo writes "Ars Technica covers a new research project that uses computers to look at 70 million words from the Congressional Record. The project's goal was to track what our representatives were talking about at any given time, and researchers were able to do it without human training or intervention. From the article: '...researchers found, for instance, that "judicial nominations" have consumed steadily more Congressional attention between 1997 and 2004. In fact, the topic produced the most number of words published in a single "day" of the Congressional Record: 230,000 on November 12, 2003.' It looks like automated topic analysis has truly arrived."
Are there really that many speeches? TheyWorkForYou.com offers a similar service for the UK's Houses of Parliament, except it's done manually, and there are only a dozen volunteers working on it.
Bogtha Bogtha Bogtha
Word frequency? That is primitive, given that there are already tools that can parse the grammar of a sentence and find the relations between words.
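For reference, a bare word-frequency count of the kind this comment calls primitive might look something like the minimal Python sketch below. It is only an illustration of the idea, not the researchers' actual method; the function name `word_frequencies`, the tiny stopword list, and the sample sentence are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = frozenset({"the", "a", "of", "to", "and", "in", "is"})

def word_frequencies(text, stopwords=STOPWORDS):
    """Count how often each non-stopword appears in text (case-insensitive)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

# Made-up sample text, not taken from the Congressional Record.
sample = "The Senate debated the judicial nominations. Nominations took all day."
freqs = word_frequencies(sample)
print(freqs.most_common(1))
```

A count like this ignores grammar entirely, which is the limitation the comment is pointing at.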
I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
30 years ago, I learned in my high school civics class that any Senator or Representative can insert anything he or she wants into the Congressional Record at any time. Examples that were pointed out to us were speeches on the floor of the Senate that were never made, modifications to committee meetings, etc. The CR is by no means an accurate measure of anything. Except maybe the size of their combined egos.
They know, don't they, that a representative can have arbitrary text inserted in the CR as if it had been read?
Also, if you watch CSPAN while Congress is in session, in the evenings you'll see long stretches with just a few people who are delivering their rants into a nearly empty room. Can that be separated from the rest of the text?
A lot of this is substantive debate in disguise. They may literally be arguing whether Bill 1 gets an hour of debate or a day of debate, but what they're really trying to do is either kill it or give it room to breathe.
WeRelate.org - wiki-based genealogy
I just finished reading John Stossel's new book (quite good, though not as good as his first). He has a section in it about the Congressional Record.
If you think the Congressional Record is an accurate account of what happens in Congress, you are dead wrong. Congressmen use taxpayer dollars to manipulate the Record because there is nothing that says they can't. They insert bogus info, like "Congressman Bob Blowhard addressed the House with a commendation for the 4-H Club of Woohah, Oklahoma." That never really happened, but it makes Congressman Blowhard look good with his constituents. They also change the words of what they really said on the floor to make themselves sound better.
Here is a blog post mentioning the problem Stossel brings up and a small excerpt
Carl
Vote Libertarian
Progress = Walk forward
Congress = Walk together/with
'-gress' is from the Latin 'gradi' (to walk)/gradus (a step). 'ghredh' comes from the same place, but 'go' obviously makes less sense than 'walk' (which it also means).