Domain: govtrack.us
Stories and comments across the archive that link to govtrack.us.
Stories · 28
-
Senate Passes Bill That Lets the Government Destroy Private Drones (engadget.com)
On Thursday, the Senate passed the FAA Reauthorization Act, which, among other things, renews funding for the Federal Aviation Administration and introduces new rules for airports and aircraft. But the bill, which now just needs to be signed by the president, also addresses drones. From a report: And while parts of the bill extend some aspects of drone use -- such as promoting drone package delivery and drone testing -- it also gives the federal government power to take down a private drone if it's seen as a "credible threat." The wording comes from another bill, the Preventing Emerging Threats Act of 2018, which was strongly supported by the Department of Homeland Security and absorbed into the FAA Reauthorization Act. In June, as part of its argument as to why it needed more leeway when it comes to drones, the agency said that terrorist groups overseas "use commercially available [unmanned aircraft systems] to drop explosive payloads, deliver harmful substances and conduct illicit surveillance," and added that the devices are also used to transport drugs, interfere with law enforcement and exploit unsecured networks. -
Google AdSense Banned a Random Webpage About a 32-Year-Old Bill Because It Was About Sexual Abuse (vice.com)
An anonymous reader quotes a report from Motherboard: Earlier this week, an algorithm made an absurd choice. Google AdSense, Google's advertising program that makes up the bulk of the tech giant's advertising revenue, decided that a web page about a decades-old bill about sexual abuse was "adult content," and wasn't allowed to display ads anymore. The page, which is at least six years old and contains strictly legislative information about a bill called the "Child Sexual Abuse and Pornography Act of 1986" on free legislative research and tracking website GovTrack.us, tripped the AdSense algorithm that decides what pages are allowed to run ads. This single, very dry page being flagged as "adult content" is most likely a minor fluke in the AdSense algorithm, but it's a perfect example of how a tiny tweak in the way a platform uses automation to enforce policies can send a ripple through seemingly-unrelated parts of the internet. The page was flagged by AdSense as "policy non-compliant" on Monday, with Google citing the page's "violations" in a summary of the AdSense adult content policy. Here's what Google told GovTrack: "As stated in our program policies, we may not show Google ads on pages with content that is sexually suggestive or intended to sexually arouse. This includes, but is not limited to: pornographic images, videos, or games; sexually gratifying text, images, audio, or video; pages that provide links for or drive traffic to content that is sexually suggestive or intended to sexually arouse." The GovTrack page contains none of these, yet the page still can't run AdSense. -
Patriot Act Expansion Fails In The House (thehill.com)
An anonymous reader writes: The "Anti-terrorism Information Sharing Is Strength Act" failed in the U.S. Congress on a vote held earlier this week. "Many libertarians warned of potential privacy violations if the measure went into effect," reported The Hill, "which helped prevent it from reaching the necessary two-thirds majority to pass through the fast-track process under which it was considered." The bill would've expanded the number of crimes which would trigger the expanded investigation powers, including crimes covered by the Computer Fraud and Abuse Act. "The Patriot Act should not be casually expanded," warned the House Liberty Caucus in a statement, arguing the bill would "permit the government to demand information on any American from any financial institution merely upon reasonable suspicion."
In a related story, a new campaign ad is criticizing Senator Russ Feingold for being the only Senator to vote against the original Patriot Act in October of 2001. Shipped to TV stations Thursday night, its narration begins "Islamic terrorists slaughtering innocents. And when Congress gave law enforcement the tools to keep Americans safe from international terror, only one senator voted no: Russ Feingold." After Friday's attack in Nice, Feingold's opponent attempted to reschedule the ads until a later date, but was unable to stop them from airing on at least three stations. -
Bison To Become First National Mammal Of The US (washingtonpost.com)
mdsolar quotes a report from Washington Post: North America used to be teeming with bison. But in one century, their numbers plummeted from tens of millions to just a few dozen in the wild after hunters nearly wiped out the continent's largest mammals. Now, the bison is about to become the first national mammal of the United States. The National Bison Legacy Act, which designates the bison as the official mammal of the United States, passed the House on Tuesday and the Senate on Thursday. The legislation now heads to President Obama's desk to be signed into law. At a time of political gridlock and partisan bickering, lawmakers agree on an official national mammal. The bison, which will join the bald eagle as a national symbol, represents the country's first successful foray into wildlife conservation. Lobbying for the official mammal designation was a coalition of conservationists; ranchers, for whom bison are business; and tribal groups, such as the InterTribal Buffalo Council, which wants to "restore bison to Indian nations in a manner that is compatible with their spiritual and cultural beliefs and practices." -
New Legislation Would Ban US Government From Purchasing Apple Products (arstechnica.com)
HughPickens.com writes: Cyrus Farivar reports at ArsTechnica that Congressman David Jolly has introduced the "No Taxpayer Support for Apple Act," a bill that would forbid federal agencies from purchasing Apple products until the company cooperates with the federal court order to assist the unlocking of a seized iPhone 5C associated with the San Bernardino terrorist attack. "Taxpayers should not be subsidizing a company that refuses to cooperate in a terror investigation that left 14 Americans dead on American soil," said Jolly, who announced in 2015 that he's running for Senate, joining the crowded GOP primary field to replace Sen. Marco Rubio. "Following the horrific events of September 11, 2001, every citizen and every company was willing to do whatever it took to side with law enforcement and defeat terror. It's time Apple shows that same conviction to further protect our nation today." Jolly's bill echoes a call from Donald Trump last month to boycott Apple until it agrees to assist the FBI. Not to fear, GovTrack gives Jolly's bill a 1% chance of being enacted. -
Treat Computer Science As a Science: It's the Law
theodp writes: Last week, President Obama signed into law H.R. 1020, the STEM Education Act of 2015, which expands the definition of STEM to include computer science for the purposes of carrying out education activities at the NSF, DOE, NASA, NOAA, NIST, and the EPA. The bill was introduced by Rep. Lamar Smith (R-TX) and Rep. Elizabeth Esty (D-CT). Smith's February press release linked to letters of support from tech billionaire-backed Code.org (whose leadership includes Microsoft President Brad Smith), and the Microsoft-backed STEM Education Coalition (whose leadership includes Microsoft Director of Education Policy Allyson Knox). -
CISPA's Author Has Another Privacy-Killing Bill To Pass Before He Retires
Daniel_Stuckey writes: "You might remember House Intelligence Chair Mike Rogers, a Republican from Michigan, from his lovely, universally-hated CISPA cybersecurity bill that would have allowed nearly seamless information sharing between companies and the federal government. You might also remember him from his c'est la vie attitude towards civil liberties in general. Well, we've got some good news and some bad news: Rogers announced today that he won't seek re-election and is instead retiring from politics to start a conservative talk radio show on Cumulus. The bad news? He's got at least one terrible, civil liberties-killing bill to try to push through Congress before he goes. Like CISPA, the newly introduced 'FISA Transparency and Modernization Act,' seeks to make it easier for the federal government to get your information from companies." -
Paul's Call To Abolish the TSA, One Year Later
A year ago today, we noted that Sen. Rand Paul of Kentucky called for the abolition of the Transportation Security Administration. It's now nearly 12 years since the hijacked-plane terror attacks of 2001; the TSA was created barely two months later, and has been (with various rules, procedures, and equipment, all of it controversial for reasons of privacy, safety, and efficacy) a major presence ever since at American commercial airports. "The American people shouldn't be subjected to harassment, groping, and other public humiliation simply to board an airplane," wrote Paul last year, and in June of 2012, he followed up by introducing two bills on the topic; the first calling for a "bill of rights" for air travelers, the other for privatizing airport screening practices. Neither bill went far. Should they have? Libertarian-leaning Paul did not succeed in knocking back the TSA, never mind privatizing its functions (currently funded at nearly $8 billion annually), though some of the things called for in his bill of rights are manifest now at least in muted form. (Very young passengers, as well as elderly passengers, face less stringent security requirements, for instance, and TSA has ended its prohibition of certain items aboard planes.) Whether you're from the U.S. or not, what practical changes would you like to see implemented? What shouldn't be on the bill of rights for airplane passengers? -
How the Syrian Games Industry Crumbled Under Sanctions and Violence
Fluffeh writes "Syria's games industry now looks like just another collateral casualty of dictator Bashar Al-Assad's struggle to hold power. 'Life for Syrian game developers has never been better,' joked Falafel Games founder Radwan Kasmiya, 'You can test the action on the streets and get back to your desktop to script it on your keyboard.' Any momentum Syria may have been building as a regional game development hub slowed considerably in 2004, when then-US President George W. Bush levied economic sanctions against the country. Under the sanctions, Syria's game developers found themselves cut off from investment money they needed to grow, as well as from other relationships that were just as important as cash. 'Any [closure of opportunity] is devastating to a budding games company as global partnerships are completely hindered,' said Rawan Sha'ban of the Jordanian game development company Quirkat. 'Even at the simplest infrastructure level, game development engines [from the US] cannot be purchased in a sanctioned country.'" -
Why CISPA Is a Really Bad Bill
We've heard recently of CISPA, the Cyber Intelligence Sharing and Protection Act, a bill currently making its way through Congress that many are calling the latest incarnation of SOPA. Reader SolKeshNaranek points out an article at Techdirt explaining exactly why this bill is bad, and how its backers are trying to deflect criticism by using language that's different and rather vague. Quoting: "The bill defines 'cybersecurity systems' and 'cyber threat information' as anything to do with protecting a network from: '(A) efforts to degrade, disrupt, or destroy such system or network; or (B) theft or misappropriation of private or government information, intellectual property, or personally identifiable information.' It's easy to see how that definition could be interpreted to include things that go way beyond network security — specifically, copyright policing systems at virtually any point along a network could easily qualify." -
Global Online Freedom Act Approved By House Committee
Fluffeh writes "While it is a bit disappointing that companies might need a law to avoid providing tools that censor free speech to overseas regimes, an updated version of a bill that's been floating around for a few years — the Global Online Freedom Act — has passed out of the House Foreign Affairs Subcommittee on Africa, Global Health and Human Rights. The version that made it out of committee took out some controversial earlier provisions that had potential criminal penalties for those who failed to report information to the Justice Department. However, the Center for Democracy and Technology has raised some concerns: 'While some companies – such as GNI members Google, Microsoft, Websense, and Yahoo! – have stepped up and acknowledged these responsibilities in an accountable way, other companies have not been so forthright. GOFA, however, is a complex bill. While it presents a number of sensible and innovative mechanisms for mitigating the negative impact of surveillance and censorship technologies, it also raises some difficult questions: can export controls be meaningfully extended in ways that reduce the spread of (to borrow words from Chairman Smith) "weapons of mass surveillance" without diminishing the ability of dissidents to connect and communicate? How can – and should – U.S. companies engage with so-called "Internet-restricting" countries?'" -
It's Not All Waste: The Complicated Life of Surplus Electronics In Africa
retroworks writes "Today's Science Daily reports on 5 new UN studies of used computer and electronics management in Africa. The studies find that about 85% of surplus electronics imports are reused, not discarded. Most of the goods pictured in 'primitive e-waste' articles were domestically generated and have been in use, or reused, for years. Africa's technology lifecycle for displays is 2-3 times the productive use cycle in OECD nations. Still, the EU bans the trade of used technology to Africa, Interpol has described 'most' African computer importers as 'criminals,' and U.S. bill HR2284 would do the same. Can Africa 'leapfrog' to newer and better tech? Or are geeks and fixers the appropriate technology for 83% of the world (the non-OECD population)?" -
Carl Malamud Answers: Goading the Government To Make Public Data Public
You asked Carl Malamud about his experiences and hopes in the gargantuan project he's undertaken to prod the U.S. government into scanning archived documents, and to make public access (rather than availability only through special dispensation) the default for newly created, timely government data. (Malamud points out that if you have comments on what the government should be focusing on preserving, and how they should go about it, the National Archives would like to read them.) Below find answers with a mix of heartening and disheartening information about how the vast project is progressing.
LoC?
by an Anonymous Reader
So how many GB/TB is a Library of Congress? :)
Or, more seriously, how big are you estimating? Are you using raw scans or some sort of compression (JPG, PNG, etc)? What resolution are you using? Do you vary the resolution depending on the document?
What sort of meta data are you putting in?
CM: The reason John Podesta and I suggested a Federal Scanning Commission in our letter at YesWeScan.Org is we really don't know how big the holdings of the government are. I can tell you that the Library of Congress is about 32 million cataloged books (a significant increase from the 6,487 books Thomas Jefferson donated to get them started). But, this is about more than books, it is about paper records, microfilmed technical papers, video, audio, photographs, and much more.
The scale is fairly vast. The Smithsonian has 137 million objects, including about 13 million images. David Ferriero, the Archivist of the United States, estimates he has over 10 billion pages of text documents, 7.2 million maps, and 40 million photographs, including everything from past census records to presidential dinner menus, and that includes about 7.5 million motion pictures and sound recordings. The Government Printing Office distributes their documents to the Federal Depository Library Program, and that includes over 60 million pages of collections including the Official Journals of Government such as the Federal Register. That's just scratching the surface, and we recommended a Federal Scanning Commission to begin the process of understanding what we have (and what is worth digitizing).
As to standards? There are lots of pretty good standards on how to digitize. NARA, Library of Congress, GPO all spec out document scans at 400 dpi, for example. For photographs, moving images, and other objects, there are some pretty good and pretty detailed standards at www.digitizationguidelines.gov. I know Brewster Kahle's operation and my own tend to work off those specifications (in fact Brewster does quite a bit of scanning for the government).
As to compression? Well, I've found people tend to overcompress things. That said, sometimes the initial quality isn't that great, so a 600 dpi uncompressed scan would be silly in some cases. But, for photographs I try very hard to keep the TIFF images around and not rely on JPEG. Likewise, for audio it is really nice to keep a nice 48 kHz version of your file around if you can, simply because if you screw up the compression maybe somebody else can do a better job in a few years. Disk space is relatively cheap, so that isn't the barrier it used to be. For video, I rip MPEG2 at whatever it is on a DVD; when I'm actually digitizing, I try to get the video bitrate up to 8-10 Mbps when ripping a Betacam or U-matic. Some people think that is overkill, but I'd rather be safe than sorry.
Metadata? Well, you got to have it or you're not going to get very far when it comes to access. Many librarians have made perfect the enemy of the good when it comes to metadata and have resisted any attempt at digitization because we don't have the very best metadata we might have. I'm more in the camp of scan what you have and get as much of the metadata as you can into it. For example, we have 3,200 1000-page volumes of briefs from the 9th Circuit of the U.S. Court of Appeals. We didn't have good metadata, but we had the Internet Archive scan them anyway. Then, after we got our PDF files, I shipped those off to a double-key team in India and they broke the briefs up into individual documents and typed the metadata into a spreadsheet for me, which we hope to release soon.
My point is that sometimes you can shoehorn the metadata in after the fact or you can use a variety of techniques to pull the metadata out of the documents (e.g., smart OCR). In theory, you can use crowdsourcing to get the metadata, but so far I've not had a lot of luck persuading thousands of people to spend their time doing that kind of work. A captcha is a quick thing to do and is between you and something you want, whereas entering metadata in for videos or documents is one of those civic duty things that everybody thinks everybody else should be doing.
Total size? Brewster says a book is about 400 Mbytes (though he's very quick to point out that you could put the words in all the books in the library into a terabyte and if you're distributing PDFs, you can easily throw 130,000 full-color, searchable PDFs onto a 4 TB drive). But, you were probably asking about raw data. Here's some raw numbers:
32 million books at 400 Mbytes each is 12.8 petabytes
50 million photos at 150 Mbytes each is 7.5 petabytes
10 billion pieces of paper ("records") at 100 Kbytes each is 1 petabyte
20 years of video at 8 Mbps is only 630 Tbytes.
(Somebody check my math?)
If you're talking a decade-long federal digitization initiative, we're looking at well south of 50 petabytes, which seems pretty doable in this day and age!
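For anyone taking up the "check my math" invitation, here is a minimal sketch that reruns the four estimates above. It assumes decimal units (1 Mbyte = 10^6 bytes, 1 petabyte = 10^15 bytes), a 365-day year, and that "8 Mbps" means 8 megabits per second; the variable and function names are made up for illustration.

```python
# Back-of-the-envelope check of the storage estimates quoted above.
# Assumption: decimal units throughout (1 MB = 10**6 bytes, 1 PB = 10**15 bytes).

KB, MB, TB, PB = 10**3, 10**6, 10**12, 10**15

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: ignores leap days


def video_bytes(years, megabits_per_second):
    """Bytes needed to store continuous video at a constant bitrate."""
    bytes_per_second = megabits_per_second * 10**6 / 8
    return years * SECONDS_PER_YEAR * bytes_per_second


estimates = {
    "books": 32_000_000 * 400 * MB,
    "photos": 50_000_000 * 150 * MB,
    "paper records": 10_000_000_000 * 100 * KB,
    "video": video_bytes(20, 8),
}

total = sum(estimates.values())

# Print each line item and the grand total in petabytes.
for name, size in estimates.items():
    print(f"{name:>14}: {size / PB:6.2f} PB")
print(f"{'total':>14}: {total / PB:6.2f} PB")
```

Under those assumptions the individual figures match the list, and the sum comes to roughly 22 petabytes, comfortably under the 50-petabyte ceiling mentioned above.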
Can the rare books collections be digitized?
by autophile
Three closely related questions about the rare books collections at the Library of Congress:
1. I know there is some kind of effort going on to digitize the rare books collections, but can it be sped up? There are many high-quality low-cost archival book scanners out there (such as the ones developed at diybookscanner.org).
2. It gets really annoying to have to receive paper copies of books when copies are requested. Why not DVDs of high-quality images?
3. Why is there no outreach by the LoC to smaller, cheaper book scanning efforts? The Internet Archive, DIYBookscanner.org, and Decapod all come to mind.
CM: In reverse order. I don't know why we aren't distributing and decentralizing our scanning efforts. The Internet Archive is a heavy-duty production shop and they do an amazing job, as do folks like Google Books and the folks digitizing things for the Mormon Church. But, there are a bunch of DIY solutions and it would be really nice if we could get more people pitching in. The biggest problem on distributing the digitization efforts is quality control. I know when it comes to ripping video, I can easily teach other people how to grab an MPEG2 off a DVD, but when it comes to things like digitizing a Betacam, that takes some training. But, we're all trainable and I wish we could all do more.
Getting back paper copies of books and papers when they're doing a copy anyway is just plain dumb. Likewise with things like FOIA results. John Podesta testified before the Senate about FOIA and said if an agency answers a FOIA request, they should also post their result online so others can see it. That seems pretty obvious.
As far as digitizing rare book collections, there are some amazing pockets throughout the government but there is no real coordination and there certainly is no effort to scan at scale or to come up with a realistic national digitization strategy. That is why we called on the White House to lead the effort. Within the Library of Congress there are some amazing collections, but if you look around to places like the National Agricultural Library or the National Library of Medicine or the libraries in the service academies you'll find lots more. Some have argued that digitizing rare books is silly because the audience is just a few academics, but I can tell you from my own experience helping host the network site for the Archimedes Palimpsest that when you make this kind of information available, there is an amazing long tail.
If you scan it, they will come. And, to answer your question, if we all scan it, they will come much sooner.
Real time legislation drafting
by kerskine
Would it be possible to implement a system that would allow real-time and continuous review of legislation while it's being drafted? Much has been made over the past three years about legislation being available for review before voting by the House or Senate. The final draft for review is usually a huge PDF that makes it nearly impossible for citizens, interest groups, and the media to analyze thoroughly in time.
CM: You want to see the sausage being made, not just buy the hot dog! I'll comment on the U.S. Congress since that's the system I know best. Thomas is a pretty good system if you happen to be stuck in 1994. It does have all the amendments and the actions and the various stages that legislation goes through. But, it isn't real time, more like "pretty quick." As Van Jacobson once quipped, "Same day service in a nanosecond world." And, Thomas isn't really machine processable; it is final form, usually formatted ASCII text (shades of NROFF!). People like Josh Tauberer who built GovTrack.US have spent considerable time crawling those systems and trying to get the data into regularized formats and make it available to others to reuse via APIs, but that isn't the same as exposing the inner workings of the sausage factory.
Majority Leader Cantor's staff has been pushing a system to make the raw data all available in XML from the Clerk's office and I think that is a very promising initiative which hopefully will bear fruit. (They're having a February 2 conference to discuss their plans if you are interested. I have no idea if it will be streamed for those of us who aren't Inside the Beltway and I don't know their schedule for moving past conferences and into production.)
Congress is a pretty complicated beast. I know some folks like Sean McGrath have had better luck with some of the state legislatures. The problem is you need to dig deep into the inner workings of a legislature. In the Congress, that means you're changing things like authoring tools that are used in the Clerk's office and by all the staff members, so you have to be careful or you get a bunch of really angry Congressmen yelling at you because their staff can't crank out the flavor-of-the-week in the form of a bill or amendment.
There's also a bit of an issue of will. My work with the Congress to put hearings on-line showed that you could take the official transcripts of a hearing and use those to generate closed captions on the video. All you need is the official transcript of the hearing, but in order to get those I had to execute a special Memorandum of Understanding with the House Oversight Committee. Other committees guard their transcripts jealously and won't let them out for quite some time. When I started processing a bunch of historical videos we purchased from C-SPAN, I went to the Government Printing Office and found that many committees never deliver their transcripts, even a decade after the fact!
How to keep track of legislative activity about open access?
by oneiros27
Recently in the Federal Register, there were two calls for comments about access to data and research from federally funded research:
http://federalregister.gov/a/2011-28623
http://federalregister.gov/a/2011-28621
I didn't hear about these until ~4 weeks after the original announcement, and with the holidays, it was too late to try to get the societies I'm involved with to prepare and vote on official statements. Are there any places where people can get/post notices of these sorts of things so that we can stay informed and try to help influence policies?
CM: The Federal Register is getting a lot better now that it is a much more open system. The idea of "Federal Register 2.0" was a paper I wrote for the Obama transition, so it is an issue I've tracked pretty closely and frankly, I've been amazed at how much better it is now. What they did is instead of selling the raw data feed for the Federal Register for $17,000/year, they went from SGML to XML and then released the data in bulk for free. A few guys out in San Francisco were looking for something to do to enter a contest and they took that bulk data and dreamed up GovPulse.US. That was such a better version of the Federal Register that the Office of the Federal Register switched the official site over to their open source platform. My point is the tools are there to do better notification mechanisms, and I'm sure the government would welcome somebody grabbing the GovPulse.US code from GitHub and making it even better.
That's the technical answer. But, the substantive answer is that there is a huge boatload of stuff in the Federal Register and it is pretty hard to figure out what to pay attention to. I also missed that particular call for comment, and I've even missed several Requests for Information coming out of places I try and pay attention to, like the White House's Office of Science and Technology Policy. And, I do this stuff full-time! Perhaps better targeted notification mechanisms are the answer. Maybe it is a social media solution, where you pay attention to things your friends are paying attention to. I hope the answer is not that the only way to pay attention is to be employed with a beltway bandit which can afford hundreds of minions that do nothing but pay attention to Washington. Indeed, there are some very fancy for-pay services from folks like Congressional Quarterly and Bloomberg that cost an arm and a leg, but I can't help but think there has to be a better way that is also open.
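As a rough illustration of what a "better targeted notification mechanism" might look like at its simplest, the sketch below filters a list of notices by keywords a society cares about. Everything in it is hypothetical: the `Notice` record, the `matching_notices` helper, and the sample entries are invented for this example and do not reflect the actual Federal Register data format or the GovPulse.US code.

```python
from dataclasses import dataclass


@dataclass
class Notice:
    """A hypothetical, simplified record for one Federal Register entry."""
    title: str
    agency: str
    abstract: str


def matching_notices(notices, keywords):
    """Return the notices whose title or abstract mentions any keyword."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for notice in notices:
        # Case-insensitive substring match over title plus abstract.
        text = f"{notice.title} {notice.abstract}".lower()
        if any(k in text for k in wanted):
            hits.append(notice)
    return hits


# Made-up sample entries, for illustration only.
sample = [
    Notice("Request for Information: Public Access to Digital Data",
           "Office of Science and Technology Policy",
           "Seeks comment on access to data from federally funded research."),
    Notice("Safety Zone; Annual Fireworks Display",
           "Coast Guard",
           "Establishes a temporary safety zone on a navigable waterway."),
]

for notice in matching_notices(sample, ["federally funded research", "open access"]):
    print(f"{notice.agency}: {notice.title}")
```

A real service would need to pull entries from the bulk data on a schedule and deliver the matches by email or feed. Plain substring matching is also crude, so this is only a starting point for the kind of open tool described above.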
What do you think of corporate partnerships?
by mhh5
I'd like to know what you think about corporate partnerships in the process to get public data released. (I'm not sure if Google Patents existed before the USPTO released its databases.) Do corporations that get involved in the process tend to make the process better without question, or are there tradeoffs in some areas because the corporations always want to help but then try to retain a proprietary version of the data for themselves?
CM: The theory is that the government gets some kind of valuable service (like digitization) that the government wouldn't get otherwise so it is a "win-win." But, the reality is all too often the government gets snookered and what we do is give some corporation exclusive access to some pot of data and the government doesn't get much of anything. The deal between Amazon and the National Archives was a good example of that kind of a private fence around the public domain. With help from Boing Boing, I started systematically purchasing those public domain videos and re-releasing them in the wild. I have no problem with Amazon selling public domain video, I just hate it when they get a de facto or a contractual exclusive. (My testimony before Congress on this subject is here.)
There are lots of other examples of government getting snookered. For example, the Government Accountability Office let Thomson West get access to 60 million or so pages of federal legislative histories. At great cost to the government, they were all packed up and dispatched to West which digitized them all and then sent them back to the government. West now sells access to this amazing database. What did the government get for its trouble? A few logins for GAO staffers. Even members of Congress need to pay to access the database! (We have an interesting paper trail on this issue.)
I'm glad you brought up the Google Patent system because I was personally involved in making that happen and I can tell you that this one is totally legit. Jon Orwant is the lead developer on this for Google and I played a small part in helping convince the White House and the Patent Office they ought to give Jon access to their data (the heavy lifting on that deal was by Beth Noveck who was the Deputy CTO at the time). Google makes all the data they got from the Patent Office available for bulk access with no strings attached. I can vouch for that because I did a mirror of their system. Last I heard Google was sending out anywhere from 1 to 10 terabytes of data PER DAY to external sources and even normally very critical folks who work in this arena have been really happy.
The big problem in the Patent Office is their computing infrastructure is a real catastrophe. Their power plant is over 95% capacity (e.g., plug in a computer, bring the building down!) and even though the Under Secretary knew that selling DVD subscriptions was silly, he wasn't able to switch over to an FTP service. He cut the deal with Google Patent and it worked out well for the government, for Google, and for everybody else.
What's the difference between the Google deal and the Amazon deal? In the case of the Amazon and GAO/West deals, the government lawyers did all the negotiating and they were totally outsmarted by some sharks in industry. But, when government has people like Under Secretary Kappos and Beth Noveck doing the negotiating, these things can work out just fine. The key is government should partner with people who want to do public service, not people who want to service the public.
Encouraging Governments?
by theNAM666
In a city such as Nashville, things as basic as business ownership and property records are not available online. In states such as New Jersey, public records such as basic corporate filings (officers, operating address/address for service of process) are accessible only for a fee.
What concrete actions can citizens confronting such situations take to encourage accessibility and accountability?
CM: I find you need a carrot and a stick to make this stuff happen, especially at the local level. Folks like Everyblock.Com and CodeForAmerica.Org have done great work prying some of these databases loose, but there is still lots to do.
The first thing you should do is pick up the phone (or pick up your email client) and write/call the people who run the system. Ask them if you can have access to the data. Sometimes, it is as simple as that.
Other times, though, it isn't quite as simple since they want the money (or they want the control or they think this should be done by "private industry" by which they mean some buddy who is a contractor). The nice thing about any government system is somebody usually has oversight responsibilities. So, the next step is to find a city council member or state legislator who has oversight on the agency in question and ask them.
Again, life isn't usually that simple, but sometimes you win! If you can't get anywhere that way, what I usually end up doing is basically competing with the government system. Build a proxy system like RECAPtheLaw.Org did to recycle paid documents. Or, get a sponsor and buy a reasonable number of docs and build a web site that looks like it is going to be a real production system.
Then, go back again and ask. Maybe if you have eyeballs or at least have a nice web site, that is enough to get the government moving. But, if that doesn't do the job, you may have no choice but to compete with them for real, which of course requires a big commitment in time and energy and not everybody can do that. I know in the case of the Patent Office, I started pestering them in 1993, including several times when I spent 6-figure sums purchasing their data, and it still took until 2011 to crack that nut.
The real trick is focus/obsession. Pick one thing you really care about and just keep pestering them until you crack it open. If you're surfing from one opengov problem to another, showing up for a 1-day hackathon then moving on to something else, you're not going to get anywhere. Pick something real and make it your thing.
Privately Owned, Copyrighted Law
by AdamnSelene
I think I have read that the law itself cannot be copyrighted and it should be possible to make it available to everyone. But as a techie who drafts standards and specifications, I was wondering about how far this goes--especially since Congress recently proposed enacting some of our standards into law. (They decided not to, but they read some parts into the committee records as they debated.) Can you still accomplish your project if a governmental body adopts (or considers adopting) a privately owned, copyrighted technical reference manual or set of safety standards as administrative law (or regulations that carry the force of law)? Or would such obstacles keep you from being able to digitize all of the government's laws (and archives of proposed laws)?
CM: The idea that the law has no copyright is a fundamental part of the American system of government. That applies to states and municipalities as well. The basic decision is Wheaton v. Peters from 1834 but that decision has been reaffirmed over and over. The law is sacred in the American system. You can't have equal protection under the law or due process under the law if there is a poll tax on access to justice.
When we get to privately developed standards, however, it turns into a very interesting issue. The basic mechanism is called Incorporation by Reference. The government will take some external document (such as a model building code) and incorporate the entire text to make it the law of the land. A guy named Peter Veeck was responsible for a landmark decision in 2002 when he published the Texas Building Code which was an incorporation of a privately-developed and very expensive model code. The court ruled that while the model code had copyright, the law of the land did not.
Based on the Veeck decision, my group went and posted many of the public safety codes enacted by the states. We started by purchasing model codes, finding the incorporating legislation, and concatenating the two pieces together and posting the resulting PDFs. More recently, we've done some extensive reworking of the California public safety codes, known as Title 24, converting the entire text into valid XHTML, recoding the graphics as SVG graphics, the formulas as MathML, and regenerating the PDF documents as nicely typeset documents instead of low-quality scans. You can see this work on the web but it is also available as a Google Code project.
The federal government also uses this mechanism intensively, with over 2,000 standards incorporated into the Code of Federal Regulations. This is non-trivial stuff, things like all the OSHA safety regulations. The issue was recently considered by a federal group called the Administrative Conference of the U.S. which basically rolled over and endorsed the idea that it is ok for important parts of the law to cost money. (Read EFF's protest letter if you want a good critique of what they did.)
I'm not necessarily saying that government should be able to appropriate any privately-developed standard and make it available. And, I'm not necessarily saying you want OSHA bureaucrats drafting the standards. But, I do think the big standards establishment and the government regulators have cut a deal that results in the law not being available and the costs forked off on private citizens and small business with extortionate monopoly prices. I just paid $847 for a 48-page safety standard from Underwriters Labs and $60 for a 2-page safety standard from the Society of Automotive Engineers, both of which are mandated by law in the CFR. They do need money to run their operations, but let me just point out that in 2009 the 501(c)(3) nonprofit Underwriters Labs paid their CEO $2,138,984 and the nonprofit SAE paid their CEO $412,578.
Ancestry.com
by An Anonymous Reader
What is your opinion about websites like Ancestry.com which make use of public records and charge a subscription fee for access? What is the incentive for the government to migrate old documents into digital form when services like these exist? Do you think Ancestry.com should be a 501(c)(3)?
CM: I'm not a big fan of for-profit corporations that have a business model of monetizing the public domain. I'm fine if they exist and fine if they make billions of dollars, but if they are the only game in town they've taken something that belongs to all of us and turned it into their private property.
The government got snookered on the Ancestry.Com deal. They could have insisted that the raw data be available in bulk for anybody else to use. The folks that approach the government to cut these sweetheart deals argue that is unreasonable because they need a "return on investment" and they argue that if they don't get the return on investment they won't do the deal (and by extension nobody else will do the deal).
But, government can argue much harder! For example, instead of negotiating some exclusive thing with Ancestry.Com, how come they didn't ask the Internet Archive to grab the data? Or put together something creative with a couple of foundations that would pay for the digitization in return for the kind of payback the foundations like to see (e.g., good press, photo opportunity with the President, or other tools of the trade)?
You asked if Ancestry.Com should be a 501(c)(3)? Not all nonprofits do something that I think should be an essential part of their mission, which is to allow others to compete with them. I believe providing open access to all data ought to be a precondition to getting nonprofit status (an idea that Gil Elbaz has been pushing for quite some time). A good example of a nonprofit that builds walls is Guidestar which wants to be the place where you go for all your nonprofit information. The IRS should be making all Form 990 returns of nonprofits available in bulk for anybody to use, which would knock the bottom out of Guidestar's attempts to build walls and force them to stay innovative and provide value.
Pacer Problems
by onyxruby
How much difficulty do you anticipate in getting and publishing records in Pacer? If there's one system that should be free it is the decisions that our courts make and yet you are charged by the page just to view the results. Are you concerned about a court taking an unkind view on your archiving what is in Pacer?
CM: PACER is an abomination. Do they take a dim view of our efforts? Well, the Administrative Office of the U.S. Courts reacted so strongly to our efforts to make their data available that they called the FBI on Aaron Swartz and cancelled the only meaningful public access system they had, which consisted of one terminal in each of 17 public libraries around the country. In this era of rapidly decreasing costs, they just boosted their access charges from 8 cents a page to 10 cents a page, arguing that this is a bargain compared to 25 cents a page for a copy machine.
What I find so disturbing about PACER is that when we did get 20 million pages of docs, we were able to conduct a comprehensive analysis of privacy violations in the courts, an analysis that led to a nice thank-you letter from the Judicial Conference and changes in their privacy rules. In other words, only when public interest groups got access to the data did we begin to address privacy issues. Public access is not just about pro se prisoners defending themselves from a jail cell, which is the view of many in the Administrative Office of the Courts. Public access is about attempts like ours (and many other folks) to make our system of justice function better. When we say we are "an empire of laws not a nation of men" that means we write down what we are doing in our courts so that it is no longer the arbitrary decisions of individuals. The paper trail is there so we can make sure the system is functioning properly. When you limit that access to those that only have a Gold Card, you pervert democracy and you pervert justice.
This principle that access to justice shouldn't hide behind a cash register goes back to the Greeks. Theseus in Euripides' Suppliants said "when there are no public laws, one man holds power by keeping the law all for himself, and there is no more equality. But when the laws are written, the weak man and the rich man have equal justice." The PACER system is justice for the rich man.
Steve Schultze and the team at Princeton did a lot of the heavy lifting on this issue, including the very nice RECAPtheLaw.Org system they built. They've also done a lot of financial analysis that shows that the courts are not only recovering their costs for operating the expensive PACER system, they're making a huge profit (to the tune of $100 million/year) and using their excess profits to do things like buy big-screen TVs in direct violation of the E-Government Act.
The basic problem on PACER is the Judicial Conference has delegated the issue to a few techie judges who think what they've built is something great. But, PACER is a hairball of bad PERL code and the result has not served the judges, the bar, or the American people very well. My only hope is that eventually, the Judicial Conference will see that their information technology is 30 years behind the rest of the Internet and feel ashamed at the travesty they have wrought. Until then, we have RECAP.
If you're interested in the issue, a couple of resources to look at are the PACER paper trail and a bit of a rant that I delivered at the Gov 2.0 summit.
How to visualize opened data?
by hardwarejunkie9
The amount of information you're trying to free is entirely staggering and consists, largely, of tables of numbers. These numbers are incredibly significant, but people generally can't see them.
After you free all of this information and make it available to the public (as it should be), then what? What do you expect the public to do with these numbers? Tables of information are not nearly as useful as graphs. This data needs to be seen, but, more importantly, it needs to be understood.
Do you have any ideas for how to disseminate this information? Perhaps a team-up with someone like gapminder.org's Hans Rosling might be particularly valuable for all of us.
CM: Actually, most of the data I'm looking at is not tables of numbers, it is video, images, textual documents, technical papers, maps, and books.
But, I definitely get what you're saying and there are a lot of numbers. For example, the IRS Form 990s should be structured data instead of PDF documents, so extracting the data from the mass of paper is the initial challenge. There are lots of other examples of this kind of initial extraction, getting what were printed paper docs into structured data. There are some interesting tools, such as OCRopus which does layout analysis, but there needs to be much more. One of the reasons we called for a Federal Scanning Commission is that we think there is a lot of directed R&D that could not only scale up mass digitization but could also work on the important value-added of extraction of structured data and handling some of the tricky issues like detecting the presence of Social Security Numbers.
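To make the Social Security Number problem concrete, here is a minimal sketch in Python of what a first-pass detector over extracted text might look like. The pattern, the function name find_ssn_candidates, and the sample string are all illustrative assumptions, not anything used by the projects discussed in this interview. It only catches the common NNN-NN-NNNN layout and would miss numbers mangled by OCR, which is part of why the problem is tricky.

```python
import re

# Assumed layout: three digits, two digits, four digits, separated by
# hyphens or spaces. Real documents contain many other layouts.
SSN_PATTERN = re.compile(r"\b(\d{3})[- ](\d{2})[- ](\d{4})\b")

def find_ssn_candidates(text):
    """Return substrings that look like SSNs and pass basic structural checks."""
    hits = []
    for match in SSN_PATTERN.finditer(text):
        area, group, serial = match.groups()
        # Area numbers 000, 666, and 900-999 are never issued.
        if area in ("000", "666") or area.startswith("9"):
            continue
        # Group 00 and serial 0000 are never issued either.
        if group == "00" or serial == "0000":
            continue
        hits.append(match.group(0))
    return hits

# Hypothetical sample text, not taken from any real filing.
sample = "Debtor SSN 123-45-6789, case no. 000-12-3456, phone 555-1234"
print(find_ssn_candidates(sample))
```

A filter like this trades recall for precision: the structural checks cut down on false alarms from docket and phone numbers, but anything it flags would still need a human or a second pass to confirm.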
Once you have the data, as you say, then what? I'm a big fan of the idea that the government starts by providing bulk data, then they provide an API, and then maybe they also build web sites and apps and other things along with everybody else out there. That's a 3-part hierarchy that Ed Felten and some of his students developed and it should be a law that applies to all government information systems that are externally facing.
The issue here is that all too often people look at a problem like "digitize all government information" and they want to see the whole stack of the solution from one place. But, I think you can do a layered approach and count on the fact that there is always somebody smarter out there and our job is to reduce the barriers to entry. So, how would I visualize the data? I have no idea, but I'd make damned sure that folks like Martin Wattenberg at Many Eyes and Hans Rosling at Gapminder knew the data was out there and then I'd sit back and be amazed at whatever they come up with. How's that for pushing the problem downstream?
Why is data access so hard?
by CanHasDIY
Can you provide any explanation as to why it is so difficult and cost-prohibitive to obtain records from the government, especially considering the abundance of laws requiring government compliance with requests for information (AKA "Sunshine Laws")?
Is it simply a matter of government employee ineptitude, or have you found evidence of a more nefarious rationale?
CM: I get that question a lot. Why would a member of Congress take deliberate steps to stop public hearings from being available? Why would a court administrator deliberately restrict access to public court documents? Usually the answer is, as Heinlein said, "you have attributed conditions to villainy that simply result from stupidity." When I'm explaining why something is so broken on a big government system, my usual answer is that there are a lot of people still stuck in the 1970s and 1980s, when information dissemination was really, really hard and it took men in white lab coats and computers the size of freight trains to process data. In other words, the problem with a lot of folks who are government gatekeepers is they just don't get the Internet and they don't get computers. In fact, usually when some senior bureaucrat is throwing stones at me, you can find younger staffers working for them rolling their eyes.
That's an optimistic view, and if I'm right things will get better. But, I'm often wrong on my predictions of the future. (I was the guy who saw TimBL demo the web in 1992 and thought to myself "interesting, but it won't scale.")
But, there is also some more nefarious stuff happening, often the accumulation of power by being able to cut exclusive deals with contractor buddies. If your life in government consists of receiving emissaries from Lockheed Martin, maybe you think you're making everybody happy by letting them build you a $1 billion computer system. Often, you think your problems are so unique that the $1 billion solution is the only answer.
And, in some cases, as we've seen from numerous GAO reports, Inspector General reports, Congressional hearings, and newspaper articles, there are some really evil people out there who think the public domain and the government is their personal business opportunity. Looting the federal government is the kind of civic crime that ranks right up there in my book with stealing cookies from Girl Scouts and selling fake medicines to sick people.
Who is the worst?
by TheBrez
Which government agency is the worst to get information from?
CM: I don't know who the worst are (there's a lot of competition for that slot), but the ones that piss me off the most are the ones that should know better.
Public.Resource.Org is a really small operation. I'm the only staff member. My part-time sysadmin is @mdkail who is pretty busy with his day job as CIO at NetFlix. My ISP is Jim Martin and his team at ISC who are kind of busy running the F-Root. My office net is supported by the amazing systems team at O'Reilly which rents me office space at below-market rates.
I'll grant you government would have a tough time getting that kind of help. But, I'm a one-man shop and we run the 4th most popular U.S. government video channel on YouTube, we're the source for a lot of the on-line presence of the U.S. Court of Appeals, and we've supported efforts for the U.S. Congress, the White House, and the National Archives. If we can do this out of Northern California, couldn't the vast resources of the federal government in Washington, D.C. do a whole lot better than they're doing now?
For me, my current bête noire is the U.S. Congress. We got half-way through processing their archives of video from congressional hearings, publishing about 31 terabytes of data. Then, a couple of staffers decided this was a bad idea and pulled the rug out from under us. They actually decided it was a bad idea to publish video from public congressional hearings.
Like any agency, Congress is a mixed bag. We had tons of support from Darrell Issa, for example, and ran a very successful pilot project for him for a year. We talked to all sorts of people on committees and in the various agencies that support the Congress. But, at the end of the day, a couple of staff members were able to decide that the public archive shouldn't be public and they terminated our project. (If you have some time, you might like to read our rather surreal paper trail.)
So, rather than the worst, I think we need to look for the most shameful, the ones that have the privilege and the power and could easily do better. I know it is in vogue to throw stones at government in general and Washington in particular, but there are times when government can be so useful and so awe inspiring it takes your breath away. Government can be that shining city on the hill but we all have to take an active part in our government to keep those lights shining bright. -
US Senator Proposes Bill To Eliminate Overtime For IT Workers
New submitter Talisman writes "Kay Hagan (D) from North Carolina has introduced a bill to the Senate that would eliminate overtime pay for IT workers." The bill is targeted at salaried IT employees and those whose hourly rate is $27.63 or more. It seems comprehensive in its description of what types of IT work qualify — everything from analysis and consulting to design and development to training and testing. The bill even uses "work related to computers" as one of the guidelines. -
New Bill Would Require US ISPs To Retain User Info
Wesociety writes "The House Judiciary Committee, led by Rep. Lamar Smith, is preparing a bill which would require internet service providers to retain information about their users to aid in criminal investigations. This particular bill would be a smaller part of a large measure to strengthen sanctions against acts such as child pornography. The most interesting part of this bill however is not who it targets but rather who it does not. The bill would make wireless companies exempt from the requirement to store user data." Declan McCullagh gives a fuller report at CNET. Update: 05/14 00:35 GMT by T : Note: Smith has yet to release the text of the current bill, but it seems an easy bet it will have much in common with his similar-sounding legislative push in 2007, which resulted in the unsuccessful SAFETY Act of 2009. -
Internet Blacklist Back In Congress
Adrian Lopez writes "A bill giving the government the power to shut down Web sites that host materials that infringe copyright is making its way quietly through the lame-duck session of Congress, raising the ire of free-speech groups and prompting a group of academics to lobby against the effort. The Combating Online Infringement and Counterfeits Act (COICA) was introduced in Congress this fall by Sen. Patrick Leahy (D-VT). It would grant the federal government the power to block access to any Web domain that is found to host copyrighted material without permission." -
Legislation To Make Web Devices Accessible To Disabled Users
pgmrdlm writes "In an effort to make web devices accessible to the disabled, the 21st Century Communications and Video Accessibility Act (H.R. 3101), submitted by Rep. Edward J. Markey (D-MA) passed the House of Representatives by a vote of 348 to 23. The related Senate bill has been introduced by Senator Mark Pryor (D-AR). Quoting Representative Markey's website: 'We've moved from Braille to Broadcast, from Broadband to the Blackberry. We've moved from spelling letters in someone's palm to the Palm Pilot. And we must make all of these devices accessible.' The Washington Post coverage notes, 'Some broadcasters put videos on the Internet with captions, but not all. That can make inaccessible everything from the political videos that are now common on the Web to pop culture clips that turn viral.' As someone who has 20/200 vision with my glasses on, I completely agree that the web has not been kind to individuals with various disabilities. But due to the size of the web, and the large number of different devices that access it, is it even possible to legislate something of this nature? Or should we rely on education and peer pressure on the various manufacturers?" -
House Overwhelmingly Passes Cybersecurity Bill
eldavojohn writes "The Caucus, a NY Times Blog, is reporting on the overwhelming majority vote (422 yeas) the House gave a new cybersecurity bill. The Cybersecurity Enhancement Act, H.R. 4061, has a number of interesting provisions. Representative Michael Arcuri, a Democrat of New York who sponsored the bill, called cybersecurity the 'Manhattan Project of our generation' and estimated the US needs 500 to 1,000 more 'cyber warriors' every year in order to keep up with potential enemies. The new bill 'authorizes one single entity, the director of the National Institute of Standards and Technology, to represent the government in negotiations over international standards and orders the White House office of technology to convene a cybersecurity university-industry task force to guide the direction of future research.'" -
Senate Passes Bill Targeting College Piracy
An anonymous reader brings news that the College Opportunity and Affordability Act has passed in the US Senate and now awaits only the President's signature before becoming law. Hidden away in the lengthy bill are sections which tie college funding to "offering alternatives to illegal downloading or peer-to-peer distribution of intellectual property as well as a plan to explore technology-based deterrents to prevent such illegal activity." The EFF issued a statement expressing concern over the bill earlier this year, shortly before the House of Representatives approved it. We discussed the introduction of the bill last November. The Senate vote was 83-8, with 9 not voting. The full text of the bill is available. The relevant section is 494, at the end of the general provisions. -
H.R. 4279 Would Establish Federal IP Cops
MrSnivvel writes "H.R. 4279, Prioritizing Resources and Organization for Intellectual Property Act of 2008, is gaining momentum in Congress. It passed the House a few days back. It would allow the Feds to seize hardware that has even one file coming from 'dubious origins,' e.g. downloaded from P2P. If passed into law, the bill would establish an Intellectual Property Enforcement Division within the office of the Deputy Attorney General. Rep. John Conyers says the goal is to 'prioritize intellectual property protection to the highest level of our government.'" -
Bill Would Bar US Companies From Net Censorship
Meredith writes "A bill that would penalize companies for assisting repressive regimes in censoring the Internet may finally be headed to a vote. The Global Online Freedom Act 'would not only prevent companies like Yahoo from giving up the goods to totalitarian regimes, but would also prohibit US-based Internet companies from blocking online content from US government or government-financed web sites in other countries.' Unfortunately, there's also a giant loophole: the president would be allowed to waive the provisions of the Act for national security purposes." -
Copyright Lobbies Threaten Federal College Funding
plasmacutter writes "The EFF is raising the alarm regarding provisions injected into a bill to renew federal funding for universities. These new provisions call for institutions of higher learning to filter their internet connections and twist students' arms over 'approved' digital media distribution services. 'Under said provision: Each eligible institution participating in any program under this title shall to the extent practicable — (2) develop a plan for offering alternatives to illegal downloading or peer-to-peer distribution of intellectual property as well as a plan to explore technology-based deterrents to prevent such illegal activity. Similar provisions in last year's bill did not survive committee, it appears however that this bill is headed toward the full house for vote.' Responding to recriminations over this threat to university funding, an MPAA representative claims federal funds should be at risk when copyright infringement happens on campus networks." We've previously discussed this topic, as well as similar issues. -
U.S. House Says the Internet is Terrorist Threat
GayBliss writes "The U.S. House of Representatives passed a bill (H.R. 1955) last month, by a vote of 404 to 6, that says the Internet is a terrorist tool and that Congress needs to develop and implement methods to combat it." -
Congress to Fight Piracy with Education Funds
Nomihn0 writes "The RIAA has announced that the House Education and Labor committee is considering an amendment, HR1689, to the Higher Education Act of 1965. The proposal would allocate federal education funds to anti-piracy measures on college campuses. Most concerning is the bill's wording. It's claimed that the proposal would 'save telecommunications bandwidth costs.' In other words, the government will fund private packet filtering and preferential bandwidth allocation. 'The Higher Education Act (HEA) generally allows schools to spend the money they receive only on certain prescribed areas such as financial aid grants and Pell loans. The new bill would allow that money to be used for more things, but does not contain a request for additional funding. Whether schools would be interested in using a limited pool of federal money to police student file-swapping remains to be seen.'" -
Bush Signs Bill Enabling Martial Law
An anonymous reader writes to point us to an article on the meaning of a new law that President Bush signed on Oct. 17. It seems to allow the President to impose martial law on any state or territory, using federal troops and/or the state's own, or other states', National Guard troops. From the article: "In a stealth maneuver, President Bush has signed into law a provision which, according to Senator Patrick Leahy (D-Vermont), will actually encourage the President to declare federal martial law. It does so by revising the Insurrection Act, a set of laws that limits the President's ability to deploy troops within the United States. The Insurrection Act (10 U.S.C. 331-335) has historically, along with the Posse Comitatus Act (18 U.S.C. 1385), helped to enforce strict prohibitions on military involvement in domestic law enforcement. With one cloaked swipe of his pen, Bush is seeking to undo those prohibitions." Here is a link to the bill in question. The relevant part is Sec. 1076 about 3/4 of the way down the page. -
Congress to Overhaul Patent Law
karvind writes "According to a story at law.com, 'lawmakers in Washington are considering changes to the patent code that would bring U.S. law closer to intellectual property standards in the rest of the industrialized world.' The stated aim of the Patent Reform Act of 2005, HR 2795, is to make the system work 'more efficiently' and be 'less prone to litigation.'"