Tracking the Congressional Attention Span

Posted by ryuzaki0 on Thursday August 3, 2006 @11:52PM from the hot-button-issues dept.

Turismo writes "Ars Technica covers a new research project that uses computers to look at 70 million words from the Congressional Record. The project's goal was to track what our representatives were talking about at any given time, and researchers were able to do it without human training or intervention. From the article: '...researchers found, for instance, that "judicial nominations" have consumed steadily more Congressional attention between 1997 and 2004. In fact, the topic produced the most number of words published in a single "day" of the Congressional Record: 230,000 on November 12, 2003.' It looks like automated topic analysis has truly arrived."


Or Maybe Not.. by vjmurphy · 2006-08-03 23:55 · Score: 5, Funny

"It looks like automated topic analysis has truly arrived."

Not according to my in-depth research. Looks like "automated topic analysis" isn't arriving at all.

http://www.google.com/search?hl=en&q=%22automated+topic+analysis%22&btnG=Google+Search

--
Vincent J. Murphy
Spandex Justice
1. Re:Or Maybe Not.. by Faylone · 2006-08-04 00:36 · Score: 3, Insightful
  
  The first thing you should do when the exact phrase can't be found is try searching for just all of the words... 59.5 million results, and the first one seems to be quite accurate. http://www.google.com/search?hl=en&q=automated+topic+analysis&btnG=Google+Search
Pro-Gress vs Con-Gress by jkrise · 2006-08-04 00:03 · Score: 3, Insightful

If Pro is the opposite of Con.... what'd Congress mean?

Just playing around with some silly words... do we need to analyse what Congressmen speak, to understand their intent or motivations? Following the money would be a better option.. and we'll find a Very High Attention Span for words like money, dollars and Big Bucks..

--
If you keep throwing chairs, one day you'll break windows....
1. Re:Pro-Gress vs Con-Gress by AltGrendel · 2006-08-04 00:24 · Score: 4, Interesting
  
  I would think that combining "follow the money" with this type of record analysis would be the best thing. Think of the money as the input and the speeches as the output.
  Correlate the two and you'd really have something.
  No, not that. What I meant was who outside of Congress is trying to push buttons, and who inside Congress is helping them. Also, you'd be able to watch for what you may consider important topics to see how they are dealt with.
  
  --
  The simple truth is that interstellar distances will not fit into the human imagination
  - Douglas Adams
2. Re:Pro-Gress vs Con-Gress by andrewman327 · 2006-08-04 00:37 · Score: 3, Insightful
  
  It's more complicated than simply money issues, but I agree that this study does not prove much. If congressmen want to stay in office they need votes and they need to do what they think will get them elected. If you want to know what has your elected official's attention, it is much more direct to look them up in Project Vote Smart.
  
  --
  Information wants a fueled airplane waiting at the hangar and no one gets hurt.
3. Re:Pro-Gress vs Con-Gress by Whiney+Mac+Fanboy · 2006-08-04 01:25 · Score: 3, Interesting
  
  If Pro is the opposite of Con.... what'd Congress mean?
  
  Just 'cause I was mildly interested (I've heard that wordplay before), I read the dictionary's entries for progress, congress and con.
  
  And it appears con (when used in pros/cons of a decision) is different to con/com (the prefix).
  
  The gress suffix is from Indo-European ghredh (to go) and pro & con have root meanings of advance/forward & to meet respectively.
  
  Progress = Forward Go.
  Congress = Meet Go.
  
  --
  There are shills on slashdot. Apparently, I'm one of them.
4. Re:Pro-Gress vs Con-Gress by General+Wesc · 2006-08-04 05:14 · Score: 2, Informative
  
  Progress = Walk forward
  Congress = Walk together/with
  
  '-gress' is from the Latin 'gradi' (to walk)/gradus (a step). 'ghredh' comes from the same place, but 'go' obviously makes less sense than 'walk' (which it also means).
I'm not sure this is the best metric... by PixelPirate · 2006-08-04 00:11 · Score: 5, Insightful

Think about it: "Who thinks we should elect Joe Six-Pack"
Lots of talk, chit-chat, chatter, etc...

"Okay, now who would want to oppose the True American, Patriot, Love, Peace Act*"
Cricket! Cricket!
*And of course this Act happens to have about thirty-thousand riders attached to it...
1. Re:I'm not sure this is the best metric... by OzPeter · 2006-08-04 01:42 · Score: 3, Funny
  
  Man I am impressed, I thought the US was done with its colonial masters, but here you are saying that congress wants to play Cricket instead of voting on a bill. Bring it on .. but be warned, no-one the US could field will ever match the Don.
  
  --
  I am Slashdot. Are you Slashdot as well?
Opposite Side by BigNumber · 2006-08-04 00:11 · Score: 4, Funny

So what scored the lowest? Individual freedoms? Constitutional Rights? Fair use?
1. Re:Opposite Side by adam1234 · 2006-08-04 00:35 · Score: 2, Insightful
  
  Judicial nominations affect all three to a very large degree.
TheyWorkForYou.com by Bogtha · 2006-08-04 00:17 · Score: 4, Informative

Even with a large team of grad students at their disposal, researchers find it difficult to tag more than a small subset of the speeches in question

Are there really that many speeches? TheyWorkForYou.com offer a similar service for the UK's Houses of Parliament, except it's done manually, and there's only a dozen volunteers working on it.

--
Bogtha Bogtha Bogtha
1. Re:TheyWorkForYou.com by joeljkp · 2006-08-04 03:07 · Score: 2, Informative
  
  As I understand it, they're searching through the Congressional Record, not simply transcripts of congressional speeches. The CR is full of pages upon pages of stuff that doesn't get spoken anywhere, except for saying "please insert this into the Record" (or something to that effect). The CR has full text of speeches, letters, reports, amendments, textual evidence, etc.
  
  --
  WeRelate.org - wiki-based genealogy
Tracking the Congressional Attention Span by DrLang21 · 2006-08-04 00:19 · Score: 5, Funny

The conclusion: Congress has ADD, just like me.

--
I see the glass as full with a FoS of 2.
1. Re:Tracking the Congressional Attention Span by Roody+Blashes · 2006-08-04 00:35 · Score: 2, Funny
  
  Suffer from a crippling neurological disorder that makes you incapable of focusing long enough to make an informed decision, become a congressperson.
  
  Spend the first forty years of your life a drunken, aimless slob with no business acumen and bad manners, become president.
  
  I think I'm beginning to see some cracks in the stout facade of democracy here....
  
  --
  If you haven't foed me yet, what are you waiting for?
2. Re:Tracking the Congressional Attention Span by Anonymous Coward · 2006-08-04 00:47 · Score: 2, Funny
  
  You'd think the American public would wake up and... OH Chappelle's show is on!
sophistication, ha? by mapkinase · 2006-08-04 00:22 · Score: 2, Funny

Does it really take a sophisticated tool to count the number of times "judicial" and "nominations" appear in the same sentence?

Maybe the submitter forgot to cite some more impressive examples?
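
For what it's worth, the naive count described here really is only a few lines. A rough sketch, assuming a crude punctuation-based sentence splitter; the function name and the sample text are made up for illustration and have nothing to do with the researchers' method:

```python
import re

def count_cooccurrences(text, first, second):
    """Count sentences in `text` that contain both words (case-insensitive)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    total = 0
    for sentence in sentences:
        # Reduce the sentence to a set of lowercase words.
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if first in words and second in words:
            total += 1
    return total

# Hypothetical sample text, not taken from the Congressional Record.
sample = (
    "The Senate debated judicial nominations all night. "
    "Nominations for judicial posts stalled again! "
    "The budget was not discussed."
)
hits = count_cooccurrences(sample, "judicial", "nominations")
print(hits)
```

A count like this only works if you already know which words to look for, which is the part the article says the researchers' tool does without human help.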

--
I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
1. Re:sophistication, ha? by mapkinase · 2006-08-04 00:34 · Score: 2, Informative
  
  The data generating process that motivates our model is the following. On each day that Congress is in session a legislator can make speeches. These speeches will be on one of a finite number K of topics. The probability that a randomly chosen speech from a particular day will be on a particular topic is assumed to vary smoothly over time. At a very coarse level, a speech can be thought of as a vector containing the frequencies of words in some vocabulary. These vectors of word frequencies can be stacked together in a matrix whose number of rows is equal to the number of words in the vocabulary and whose number of columns is equal to the number of speeches. This matrix is our outcome variable. Our goal is to use the information in this matrix to make inferences about the topic probabilities and how they change over time as well as the topic membership of individual speeches.
  
  Word frequency? That is primitive given the fact that there are already tools that can parse the grammar of a sentence and find relations between words.
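
To make the quoted description concrete, here is a minimal sketch of the word-frequency matrix it talks about: one row per vocabulary word, one column per speech. It covers only the matrix construction, not the topic model built on top of it. The function name, the whitespace tokenizer, and the toy speeches are all assumptions for illustration.

```python
from collections import Counter

def term_document_matrix(speeches):
    """Return (vocabulary, matrix); matrix[i][j] counts word i in speech j."""
    # Crude tokenization: lowercase and split on whitespace.
    counts = [Counter(speech.lower().split()) for speech in speeches]
    # A sorted vocabulary gives every word a stable row index.
    vocabulary = sorted(set().union(*counts)) if counts else []
    # Rows are words, columns are speeches, as in the quoted passage.
    matrix = [[c[word] for c in counts] for word in vocabulary]
    return vocabulary, matrix

# Hypothetical toy "speeches", not taken from the Congressional Record.
speeches = [
    "the nominee deserves a vote",
    "the budget needs a vote and the budget needs cuts",
]
vocabulary, matrix = term_document_matrix(speeches)
for word, row in zip(vocabulary, matrix):
    print(f"{word:10s} {row}")
```

Each column is the "vector of word frequencies" for one speech. Word order and grammar are thrown away at this step, which is exactly the simplification being complained about.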
  
  --
  I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
They're asking the wrong question by MarkusQ · 2006-08-04 00:22 · Score: 4, Funny

Great. Now we know what congress has been talking about.
Big deal.
Wake me up when you can tell me what in the hell they were thinking.
--MarkusQ
P.S. Other than how to make sure that they--and Joe Lieberman--get re-elected I mean.
Process Process Process by LaughingCoder · 2006-08-04 00:26 · Score: 4, Interesting

That disease that has so infected business - talking about process (how) rather than products (what) - is readily apparent in Congress as well. I added up the percentages of the "Procedural [HouseKeeping]" categories (egads, there were 6 different line items - not sure what the distinctions were), and it was 50%!!! So, for half the time Congress is talking about *how* they are going to talk about things. Ugggh. I suppose, as one who believes that the less the government does, the better, I should be happy. But oh, the global warming from all that hot air!

--
The more you regulate a company, the worse its products become.
1. Re:Process Process Process by AnyoneEB · 2006-08-04 02:33 · Score: 2, Insightful
  
  And yet they still do not have reasonable rules like forbidding riders...
  
  --
  Centralization breaks the internet.
2. Re:Process Process Process by joeljkp · 2006-08-04 03:10 · Score: 2, Informative
  
  A lot of this is substantive debate in disguise. They may literally be arguing whether Bill 1 gets an hour of debate or a day of debate, but what they're really trying to do is either kill it or give it room to breathe.
  
  --
  WeRelate.org - wiki-based genealogy
Reading the Record???? by jackb_guppy · 2006-08-04 00:30 · Score: 5, Insightful

The congressional record is a false document of what happened in congress. Watch C-Span one day and hear each person request "Unanimous support to change or extend". This allows a 30-second comment, say, against the bill to become a 2-hour speech supporting the bill WITHOUT editing marks.

This program may count time on paper but can not count time that congress is actually spending.
Garbage in, Garbage Out. by GeneralCern · 2006-08-04 00:41 · Score: 2, Interesting

The record isn't actually what they talked about...

...it's what they want you to THINK they talked about.

http://www.townhall.com/columnists/JohnStossel/2006/05/31/myths_and_lies_on_the_record
The CR is anything but accurate by sgtrock · 2006-08-04 00:52 · Score: 5, Informative

30 years ago, I learned in my high school civics class that any Senator or Representative can insert anything he or she wants into it at any time. Examples that were pointed out to us were speeches on the floor of the Senate that were never made, modifications to committee meetings, etc. The CR is by no means an accurate measure of anything. Except maybe the size of their combined egos.
1. Re:The CR is anything but accurate by Peyna · 2006-08-04 02:37 · Score: 2, Informative
  
  You're half-right there. They can get anything they want into the record without actually having to say it in front of everyone. This is good in some respects, because it allows that person to be officially on the Congressional Record on a particular point without having to tie up the time of the congressional body.
  
  However, they can't modify things that are already in the record (at least, not without being subjected to censure or other punishment).
  
  --
  What?
Congress Zeitgeist by Speare · 2006-08-04 01:12 · Score: 2, Interesting

So in web2.0 terms, this is Google Zeitgeist meets the Statistically Improbable Phrase analysis like you see on Amazon. Find pairs or sets of words which are out of the statistical norm for English, then start to track their rise and fall among the "marketplace of ideas" in Congress. Also, on the c|net news site, they have two graph views to visualize connections between similar-topic stories or often-viewed "hot" stories.

It would be interesting to see how many phrases are just a matter of the odd language that Congress uses. There's a stock metaphorical phrase for just about anything, and there are also a lot of phrases that are steeped in tradition which often get misunderstood by layfolk.
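
The "pairs of words out of the statistical norm" idea can be roughed out with pointwise mutual information (PMI) over adjacent words. To be clear, this is a swapped-in stand-in and an assumption on my part: it is not Amazon's actual method, which is not described here, and it compares pairs only against the text itself rather than against a norm for English. The function name and sample text are made up.

```python
import math
from collections import Counter

def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI > 0 means the pair occurs more often than chance predicts.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical sample text, not taken from the Congressional Record.
text = ("the senate debated judicial nominations and the house debated "
        "the budget while the senate delayed judicial nominations again")
ranked = bigram_pmi(text.split())
for pair, score in ranked:
    print(pair, round(score, 2))
```

Tracking how a pair's score or raw count moves from one day's text to the next would give the rise-and-fall view. Stock procedural phrases would score high too, so they would need filtering.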

--
[ .sig file not found ]
Congressional Record vs. what's actually said... by jejones · 2006-08-04 01:13 · Score: 4, Informative

They know, don't they, that a representative can have arbitrary text inserted in CR as if it had been read?

Also, if you watch CSPAN while Congress is in session, in the evenings you'll see long stretches with just a few people who are delivering their rants into a nearly empty room. Can that be separated from the rest of the text?
Re:Corrupting the judiciary is a strong focus now. by ryturner · 2006-08-04 01:27 · Score: 2, Insightful

Any sources to back up that statement?
See also: Clustering senators by votes & topic by Anonymous Coward · 2006-08-04 02:01 · Score: 5, Interesting

You might also be interested in another topic model that not only automatically discovers topics, but also automatically discovers topic-specific groupings of the senators by their votes. http://www.cs.umass.edu/~mccallum/papers/grouptopic_linkkdd05.pdf "Group and Topic Discovery from Relations and Text."

It uses not only word data (from the text of 16 years' worth of bills voted on in the U.S. Senate), but also the senators' voting records.

For example, you can see that Sen. Chafee (R-RI) (who was mentioned on this morning's NPR as a "liberal Republican") actually does fall into a cluster of Democrats, not fellow Republicans. When automatically discovering topics using word data alone (without the votes, as the wustl.edu paper above does), the topics on this Senate data are reasonably coherent, but the topics created by this new "Group-Topic" model are even more interesting because their discovery is driven by the need to predict the votes as well as the words. For example, "Social Security" doesn't appear in the old model, but pops out clearly in the new model because it has such a distinct voting pattern.

Some of the other results are also pretty interesting: on Education and Domestic policy the Republicans are more split than the Democrats (forming 3 groups, to the Democrats' 1 group). On other topics, the split is the other way around.

Using the same technique, there is also an analysis of 60 years worth of voting records from the U.N. On the topic of "human rights", Nicaragua, Papua, Rwanda, Swaziland and Fiji all get clustered together---ouch!
Congressional Record *IS* false by donutz · 2006-08-04 03:25 · Score: 2, Interesting

If you want more proof, read this article by John Stossel, which takes a look at what the "Congressional Record" is really all about. Or like parent says, watch CSPAN.
But the Congressional Record is faked by carlivar · 2006-08-04 04:02 · Score: 3, Informative

I just finished reading John Stossel's new book (quite good, though not as good as his first). He has a section in it about the Congressional Record.

If you think the Congressional Record is an accurate account of what happens in Congress you are dead wrong. Congressmen use taxpayer dollars to manipulate the Record because there is nothing that says they can't. They insert bogus info, like "Congressman Bob Blowhard addressed the House with a commendation for the 4-H Club of Woohah, Oklahoma", which never really happened but makes Congressman Blowhard look good with his constituents. They also change the words of what they really said on the floor to make themselves sound better.

Here is a blog post mentioning the problem Stossel brings up and a small excerpt

Carl

--
Vote Libertarian
Does it track... by SlowEmotionReplay · 2006-08-04 07:38 · Score: 2, Funny

which orifice they were talking out of?