Domain: ibm.com
Stories and comments across the archive that link to ibm.com.
Stories · 981
-
After 23 Years, IBM Sells Off Lotus Notes (techcrunch.com)
"IBM has agreed to sell select software products to HCL Technologies," writes Slashdot reader virtig01. "Included among these is everyone's favorite email and calendaring tool, Lotus Notes and Domino." TechCrunch reports: IBM paid $3.5 billion for Lotus back in the day. The big pieces here are Lotus Notes, Domino and Portal. These were a big part of IBM's enterprise business for a long time, but last year Big Blue began to pull away, selling the development part to HCL, while maintaining control of sales and marketing. This announcement marks the end of the line for IBM involvement. With the development of the platform out of its control, and in need of cash after spending $34 billion for Red Hat, perhaps IBM simply decided it no longer made sense to keep any part of this in-house. As for HCL, it sees an opportunity to continue to build the Notes/Domino business. "The large-scale deployments of these products provide us with a great opportunity to reach and serve thousands of global enterprises across a wide range of industries and markets," C Vijayakumar, president and CEO at HCL Technologies, said in a statement announcing the deal. -
IBM Aims To Meld AI With Human Resources With Watson Suite (zdnet.com)
PolygamousRanchKid shares a report from ZDNet (with some commentary): IBM has launched a unit designed for human resources to better find talent and recruit using artificial intelligence. The company is wrapping its latest HR effort, dubbed IBM Talent & Transformation, which includes select Watson services. According to IBM, its suite of AI tools can help HR become a growth engine to enable digital transformation. AI can be used to revamp workflow, employee engagement, recruitment and retention while providing a more diverse workforce. (I can still program Fortran; I learned it from Forman S. Acton -- does that make me diverse enough?) Big Blue's Talent & Transformation suite includes a Watson Talent Suite that rolls up behavioral science, AI and psychology and applies it to HR. (Sounds like the recipe for The Apocalypse to me.) IBM Garage, which serves as a test bed to meld HR, AI and culture, will also be available. (Garage? It sounds like the creepy CRISPR basement of a mad scientist to me.) -
IBM Unveils the 'World's Smallest Computer' (mashable.com)
On the first day of IBM Think 2018, the company's flagship conference, IBM has unveiled what it claims is the world's smallest computer. It's smaller than a grain of salt and features the computer power of the x86 chip from 1990. Mashable first spotted this gem: The computer will cost less than ten cents to manufacture, and will also pack "several hundred thousand transistors," according to the company. These will allow it to "monitor, analyze, communicate, and even act on data." It works with blockchain. Specifically, this computer will be a data source for blockchain applications. It's intended to help track the shipment of goods and detect theft, fraud, and non-compliance. It can also do basic AI tasks, such as sorting the data it's given. According to IBM, this is only the beginning. "Within the next five years, cryptographic anchors -- such as ink dots or tiny computers smaller than a grain of salt -- will be embedded in everyday objects and devices," says IBM head of research Arvind Krishna. If he's correct, we'll see way more of these tiny systems in objects and devices in the years to come. It's not clear yet when this thing will be released -- IBM researchers are currently testing its first prototype. -
IBM's Watson Is Going To Space (thenextweb.com)
Yesterday, IBM announced it would be providing the AI brain for a robot being built by Airbus to accompany astronauts aboard the International Space Station (ISS). "The robot, which looks like a flying volleyball with a low-resolution face, is being deployed with Germany astronaut Alexander Gerst in June for a six month mission," reports The Next Web. "It's called CIMON, an acronym for Crew Interactive Mobile Companion, and it's headed to space to do science stuff." From the report: It'll help crew members conduct medical experiments, study crystals, and play with a Rubix cube. Best of all, just like "Wilson," the other volleyball with a face and Tom Hanks' costar in the movie Castaway, CIMON can be the astronauts' friend. According to an IBM blog post: "CIMON's digital face, voice and use of artificial intelligence make it a 'colleague' to the crew members. This collegial 'working relationship' facilitates how astronauts work through their prescribed checklists of experiments, now entering into a genuine dialogue with their interactive assistant." -
IBM's Watson Is Going To Space (thenextweb.com)
Yesterday, IBM announced it would be providing the AI brain for a robot being built by Airbus to accompany astronauts aboard the International Space Station (ISS). "The robot, which looks like a flying volleyball with a low-resolution face, is being deployed with Germany astronaut Alexander Gerst in June for a six month mission," reports The Next Web. "It's called CIMON, an acronym for Crew Interactive Mobile Companion, and it's headed to space to do science stuff." From the report: It'll help crew members conduct medical experiments, study crystals, and play with a Rubix cube. Best of all, just like "Wilson," the other volleyball with a face and Tom Hanks' costar in the movie Castaway, CIMON can be the astronauts' friend. According to an IBM blog post: "CIMON's digital face, voice and use of artificial intelligence make it a 'colleague' to the crew members. This collegial 'working relationship' facilitates how astronauts work through their prescribed checklists of experiments, now entering into a genuine dialogue with their interactive assistant." -
IBM Open Sources 'WebSphere Liberty' For Java Microservices and Cloud-Native Apps (techrepublic.com)
An anonymous reader quotes TechRepublic: On Wednesday, IBM revealed the Open Liberty project, open sourcing its WebSphere Liberty code on GitHub to support Java microservices and cloud-native apps. The company created Liberty five years ago to help developers more quickly and easily create applications using agile and DevOps principles, according to an IBM developerWorks blog post from Ian Robinson, WebSphere Foundation chief architect at IBM... Developers can also choose to move to the commercial versions of WebSphere Liberty at any time, he noted, which include technical support and more specialized features... "We hope Open Liberty will help more developers turn their ideas into full-fledged, enterprise ready apps," Robinson wrote. "We also hope it will broaden the WebSphere family to include more ideas and innovations to benefit the broader Java community of developers at organizations big and small."
IBM argues that Open Liberty, along with the OpenJ9 VM they open sourced last week, "provides the full Java stack from IBM with a fully open licensing model."
Interestingly, Slashdot ran a story asking "IBM WebSphere SE To Be Opened?" -- back in 2000. -
IBM Unveils Blockchain As a Service Based On Open Source Hyperledger Fabric Technology (techcrunch.com)
IBM has unveiled its "Blockchain as a Service," which is based on the open source Hyperledger Fabric, version 1.0 from The Linux Foundation. "IBM Blockchain is a public cloud service that customers can use to build secure blockchain networks," TechCrunch reports, noting that it's "the first ready-for-primetime implementation built using that technology." From the report: Although the blockchain piece is based on the open source Hyperledger Fabric project of which IBM is a participating member, it has added a set of security services to make it more palatable for enterprise customers, while offering it as a cloud service helps simplify a complex set of technologies, making it more accessible than trying to do this alone in a private datacenter. The Hyperledger Fabric project was born around the end of 2015 to facilitate this, and includes other industry heavyweights such as State Street Bank, Accenture, Fujitsu, Intel and others as members. While the work these companies have done to safeguard blockchain networks, including setting up a network, inviting members and offering encrypted credentials, was done under the guise of building extra safe networks, IBM believes it can make them even safer by offering an additional set of security services inside the IBM cloud. While Jerry Cuomo, VP of blockchain technology at IBM, acknowledges that he can't guarantee that IBM's blockchain service is unbreachable, he says the company has taken some serious safeguards to protect it. This includes isolating the ledger from the general cloud computing environment, building a security container for the ledger to prevent unauthorized access, and offering tamper-responsive hardware, which can actually shut itself down if it detects someone trying to hack a ledger. What's more, IBM claims their blockchain product is built in a highly auditable way to track all of the activity that happens within a network, giving administrators an audit trail in the event something did go awry. -
IBM's Project Intu brings Watson's Capabilities To Any Device (siliconangle.com)
IBM has launched a new system-agnostic platform called Project Intu with which it aims to bring "embodied cognition" to a range of devices. From a report on SiliconAngle: In IBM's parlance, "cognitive computing" refers to machine learning. The idea behind Project Intu is that developers will be able to use the platform to embed the various machine learning functions offered by IBM's Watson service into various applications and devices, and make them work across a wide spectrum of form factors. So, for example, developers will be able to use Project Intu's capabilities to embed machine learning capabilities into pretty much any kind of device, from avatars to drones to robots and just about any other kind of Internet of Things' device. As a result, these devices will be able to "interact more naturally" with users via a range of emotions and behaviors, leading to more meaningful and immersive experiences for users, IBM said. What's more, because Project Intu is system-agnostic, developers can use it to build cognitive experiences on a wide range of operating systems, be it Raspberry PI, MacOS, Windows or Linux. Project Intu is still an experimental platform, and it can be accessed via the Watson Developer Cloud, the Intu Gateway and also on GitHub. -
IBM Gives Everyone Access To Its Five-Qubit Quantum Computer (fortune.com)
An anonymous reader writes: IBM said on Wednesday that it's giving everyone access to one of its quantum computing processors, which can be used to crunch large amounts of data. Anyone can apply through IBM Research's website to test the processor, however, IBM will determine how much access people will have to the processor depending on their technology background -- specifically how knowledgeable they are about quantum technology. With the project being "broadly accessible," IBM hopes more people will be interested in the technology, said Jerry Chow, manager of IBM's experimental quantum computing group. Users can interact with the quantum processor through the Internet, even though the chip is stored at IBM's research center in Yorktown Heights, New York, in a complex refrigeration system that keeps the chip cooled near absolute zero. -
Consumers Expect Their Cars To Become Mini Data Centers (networkworld.com)
coondoggie writes: Many consumers expect self-driving cars to become common in the not-too-distant future -- cars that diagnose problems without human intervention, cars that adapt to a particular driver's behaviors and react to its environment. Those are some of the conclusions from IBM's 'Auto 2025: A New Relationship – People and Cars' research involving 16,000 global consumers who were asked how they expect to use vehicles in the next ten years. IBM found consumers have the expectation that cars will soon communicate with other vehicles and infrastructure around them, integrating easily into a broader collection of traffic. More than a third of consumers said they'd be likely to allow collection of their driving data to support these services -- a notable figure, given that IBM is partnering with Ford to do exactly that. -
The Mainframe Is Dead! Long Live the Mainframe!
HughPickens.com writes The death of the mainframe has been predicted many times over the years but it has prevailed because it has been overhauled time and again. Now Steve Lohr reports that IBM has just released the z13, a new mainframe engineered to cope with the huge volume of data and transactions generated by people using smartphones and tablets. "This is a mainframe for the mobile digital economy," says Tom Rosamilia. "It's a computer for the bow wave of mobile transactions coming our way." IBM claims the z13 mainframe is the first system able to process 2.5 billion transactions a day and has a host of technical improvements over its predecessor, including three times the memory, faster processing and greater data-handling capability. IBM spent $1 billion to develop the z13, and that research generated 500 new patents, including some for encryption intended to improve the security of mobile computing. Much of the new technology is designed for real-time analysis in business. For example, the mainframe system can allow automated fraud prevention while a purchase is being made on a smartphone. Another example would be providing shoppers with personalized offers while they are in a store, by tracking their locations and tapping data on their preferences, mainly from their previous buying patterns at that retailer.
IBM brings out a new mainframe about every three years, and the success of this one is critical to the company's business. Mainframes alone account for only about 3 percent of IBM's sales. But when mainframe-related software, services and storage are included, the business as a whole contributes 25 percent of IBM's revenue and 35 percent of its operating profit. Ronald J. Peri, chief executive of Radixx International was an early advocate in the 1980s of moving off mainframes and onto networks of personal computers. Today Peri is shifting the back-end computing engine in the Radixx data center from a cluster of industry-standard servers to a new IBM mainframe and estimates the total cost of ownership including hardware, software and labor will be 50 percent less with a mainframe. "We kind of rediscovered the mainframe," says Peri. -
Data Storage Pioneer Wins Millennium Technology Prize
jones_supa (887896) writes "The British scientist Stuart Parkin, whose work made it possible for hard disks to radically expand in size, has been awarded the Millennium Technology Prize (Millennium-teknologiapalkinto). Professor Parkin's discoveries rely on magneto-resistive thin-film structures and the development of the giant magnetoresistance (GMR) spin-valve read head. These advances allow more information to be stored on each disk platter. Technology Academy Finland — the foundation behind the award — justifies the prize by saying that Parkin's innovations allow us to store large volumes of data in cloud services." He is currently working on Racetrack memory, which would obsolete flash and hard disks (and probably even RAM). -
Fifty Years Ago IBM 'Bet the Company' On the 360 Series Mainframe
Hugh Pickens DOT Com (2995471) writes "Those of us of a certain age remember well the breakthrough that the IBM 360 series mainframes represented when it was unveiled fifty years ago on 7 April 1964. Now Mark Ward reports at BBC that the first System 360 mainframe marked a break with all general purpose computers that came before because it was possible to upgrade the processors but still keep using the same code and peripherals from earlier models. "Before System 360 arrived, businesses bought a computer, wrote programs for it and then when it got too old or slow they threw it away and started again from scratch," says Barry Heptonstall. IBM bet the company when they developed the 360 series. At the time IBM had a huge array of conflicting and incompatible lines of computers, and this was the case with the computer industry in general at the time, it was largely a custom or small scale design and production industry, but IBM was such a large company and the problems of this was getting obvious: When upgrading from one of the smaller series of IBM computers to a larger one, the effort in doing that transition was so big so you might as well go for a competing product from the "BUNCH" (Burroughs, Univac, NCR, CDC and Honeywell). Fred Brooks managed the development of IBM's System/360 family of computers and the OS/360 software support package and based his software classic "The Mythical Man-Month" on his observation that "adding manpower to a late software project makes it later." The S/360 was also the first computer to use microcode to implement many of its machine instructions, as opposed to having all of its machine instructions hard-wired into its circuitry. Despite their age, mainframes are still in wide use today and are behind many of the big information systems that keep the modern world humming handling such things as airline reservations, cash machine withdrawals and credit card payments. "We don't see mainframes as legacy technology," says Charlie Ewen. "They are resilient, robust and are very cost-effective for some of the work we do."" -
Over 20% of Online Black Friday Sales Came From Mobile Devices
cagraham writes "According to IBM's latest Data Benchmark report, 21.8% of all online Black Friday sales were made from mobile devices. Mobile traffic, meanwhile, accounted for 39.7% of all Black Friday traffic. Interestingly, iOS users accounted for 18.1% of online sales, while Android users accounted for just 3.5%. The data come from IBM's real-time monitoring over 800 U.S. online retailers. The report also notes that tablets generated less traffic than smartphones, but accounted for almost twice the number of sales. Overall, online sales for Black Friday grew 18.9% year-over-year." -
IBM Opens Up POWER Architecture For Licensing
New submitter HAL11000 was the first of many to write with news that IBM and others have formed a new consortium to license the POWER architecture to third parties "IBM puts up POWER architecture for licensing and announces the OpenPower Consortium with Google, Nvidia, Mellanox, and Tyan." Quoting El Reg: "The plan, according to McCredie, is to open up the intellectual property for the Power architecture and to allow customizations by licensees, just like ARM Holdings has done brilliantly with its ARM processors ... Nvidia is very excited about the prospects of marrying Power processors and Nvidia GPUs for both HPC and general purpose systems. ... Tyan will presumably be working on alternative motherboards to the ones that IBM has manufactured for its own use." There are mentions of the POWER firmware being "open sourced," but it is unclear if that actually means Open Source or something more like the Open Group's definition of open (vendors only). -
IBM Buys Dallas Based Softlayer For $2 Billion
An anonymous reader writes "IBM this morning announced a deal to acquire the Dallas based hosting company Softlayer, the largest privately held cloud computing provider in the world. Formerly known as The Planet, they have a dark past and hopefully a bright future. Interesting that ISS and Softlayer will now be under the same roof. 'IBM will integrate SoftLayer’s public-cloud services with its own IBM SmartCloud portfolio. In theory, that will allow IBM to more speedily deliver a combination of private, public and hybridized cloud platforms to business clients. CloudLayer features include the ability to deploy virtual cloud servers (with processors 2.0GHz or faster), a content-delivery network with scalability and security, an object-storage platform based on OpenStack Object Storage, and private-cloud solutions.'" -
Why the 'Star Trek Computer' Will Be Open Source and Apache Licensed
psykocrime writes "The crazy kids at Fogbeam Labs have a new blog post positing that there is a trend towards advanced projects in NLP, Information Retrieval, Big Data and the Semantic Web moving to the Apache Software Foundation. Considering that Apache UIMA is a key component of IBM Watson, is it wrong to believe that the organization behind Hadoop, OpenNLP, Jena, Stanbol, Mahout and Lucene will ultimately be the home of a real 'Star Trek Computer'? Quoting: 'When we talk about how the Star Trek computer had “access to all the data in the known Universe”, what we really mean is that it had access to something like the Semantic Web and the Linked Data cloud. Jena provides a programmatic environment for RDF, RDFS and OWL, SPARQL and includes a rule-based inference engine. ... In addition to supporting the natural language interface with the system, OpenNLP is a powerful library for extracting meaning (semantics) from unstructured data - specifically textual data in an unstructured (or semi structured) format. An example of unstructured data would be the blog post, an article in the New York Times, or a Wikipedia article. OpenNLP combined with Jena and other technologies, allows “The computer” to “read” the Web, extracting meaningful data and saving valid assertions for later use.'" Speaking of the Star Trek computer, I'm continually disappointed that neither Siri nor Google Now can talk to me in Majel Barrett's voice. -
IBM Dipping Chips In 'Ionic Liquid' To Save Power
Nerval's Lobster writes "IBM announced this week that it has developed a way to manufacture both logic and memory that relies on a small drop of 'ionic liquid' to flip oxides back and forth between an insulating and conductive state without the need to constantly draw power. In theory, that means both memory and logic built using those techniques could dramatically save power. IBM described the advance in the journal Science, and also published a summary of its results to its Website. The central idea is to eliminate as much power as possible as it moves through a semiconductor. IBM's solution is to use a bit of 'ionic liquid' to flip the state. IBM researchers applied a positively charged ionic liquid electrolyte to an insulating oxide material — vanadium dioxide — and successfully converted the material to a metallic state. The material held its metallic state until a negatively charged ionic liquid electrolyte was applied in order to convert it back to its original, insulating state. A loose analogy would be to compare IBM's technology to the sort of electronic ink used in the black-and-white versions of the Kindle and other e-readers. There, an electrical charge can be applied to the tiny microcapsules that contain the 'ink,' hiding or displaying them to render a page of text. Like IBM's solution, the e-ink doesn't require a constant charge; power only needs to be applied to re-render or 'flip' the page. In any event, IBM's technique could conceivably be applied to both mobile devices as well as power-hungry data centers." -
IBM Predicts the Next 5 Years of Computing
SternisheFan writes "Shaun McGlaun of Slashgear writes: IBM has offered up its annual list of five innovations that will change our lives within five years. IBM calls the list the 'IBM 5 in 5.' The list covers innovations that IBM believes that the potential change the way people work, live, and interact over the next five years. The five innovations IBM lists this year include touch, sight, hearing, taste, and smell. " -
Are Indian High Schoolers Manning Your IBM Help Desk?
theodp writes "IBM CEO Virginia M. Rometty's Big Blue bio boasts that she led the development of IBM Global Delivery Centers in India. In his latest column, Robert X. Cringely wonders if customers of those centers know what they're getting for their outsourcing buck. 'Right now,' writes Cringely, 'IBM is preparing to launch an internal program with the goal of increasing in 2013 the percentage of university graduates working at its Indian Global Delivery Centers (GDCs) to 50 percent. This means that right now most of IBM's Indian staffers are not college graduates. Did you know that? I didn't. I would be very surprised if IBM customers knew they were being supported mainly by graduates of Indian high schools.'" -
IBM Creates 'Breathing' High-Density Lithium-Air Battery
MrSeb writes "As part of IBM's Battery 500 project — an initiative started in 2009 to produce a battery capable of powering a car for 500 miles — Big Blue has successfully demonstrated a light-weight, ultra-high-density, lithium-air battery. In it, oxygen is reacted with lithium to create lithium peroxide and electrical energy. When the battery is recharged, the process is reversed and oxygen is released — in the words of IBM, this is an 'air-breathing' battery. While conventional batteries are completely self-contained, the oxygen used in a lithium-air battery comes from the atmosphere, so the battery itself can be much lighter. The main thing, though, is that lithium-air energy density is a lot higher than conventional lithium-ion batteries: the max energy density of lithium-air batteries is theorized to be around 12 kWh/kg, some 15 times greater than li-ion — and more importantly, comparable to gasoline." -
Carl Malamud Answers: Goading the Government To Make Public Data Public
You asked Carl Malamud about his experiences and hopes in the gargantuan project he's undertaken to prod the U.S. government into scanning archived documents, and to make public access (rather than availability only through special dispensation) the default for newly created, timely government data. (Malamud points out that if you have comments on what the government should be focusing on preserving, and how they should go about it, the National Archives would like to read them.) Below find answers with a mix of heartening and disheartening information about how the vast project is progressing.
LoC?
by an Anonymous Reader
So how many GB/TB is a Library of Congress? :)
Or, more seriously, how big are you estimating? Are you using raw scans or some sort of compression (JPG, PNG, etc)? What resolution are you using? Do you vary the resolution depending on the document?
What sort of meta data are you putting in?
CM: The reason John Podesta and I suggested a Federal Scanning Commission in our letter at YesWeScan.Org is we really don't know how big the holdings of the government are. I can tell you that the Library of Congress is about 32 million cataloged books (a significant increase from the 6,487 books Thomas Jefferson donated to get them started). But, this is about more than books, it is about paper records, microfilmed technical papers, video, audio, photographs, and much more.
The scale is fairly vast. The Smithsonian has 137 million objects, including about 13 million images. David Ferriero, the Archivist of the United States estimates he has over 10 billion pages of text documents, 7.2 million maps, and 40 million photographs including everything from past census records to presidential dinner menus, and that includes about 7.5 million motion pictures and sound recordings. The Government Printing Office distributes their documents to the Federal Depository Library Program, and that includes over 60 million pages of collections including the Official Journals of Government such as the Federal Register. That's just scratching the surface, and we recommended a Federal Scanning Commission to begin the process of understanding what we have (and what is worth digitizing).
As to standards? There are lots of pretty good standards on how to digitize. NARA, Library of Congress, GPO all spec out document scans at 400 dpi, for example. For photographs, moving images, and other objects, there are some pretty good and pretty detailed standards at www.digitizationguidelines.gov. I know Brewster Kahle's operation and my own tend to work off those specifications (in fact Brewster does quite a bit of scanning for the government).
As to compression? Well, I've found people tend to overcompress things. That said, sometimes the initial quality isn't that great, so a 600 dpi uncompressed scan would be silly in some cases. But, for photographs I try very hard to keep the TIFF images around and not rely on JPEG. Likewise, for audio it is really nice to keep a nice 48 khz version of your file around if you can simply because if you screw up the compression maybe somebody else can do a better job in a few years. Disk space is relatively cheap, so that isn't the barrier it used to be. For video, I rip MPEG2 at whatever it is on a DVD, when I'm actually digitizing I try to get the video bitrate up to 8-10 mbps when ripping a Betacam or Umatic. Some people think that is overkill, but I'd rather be safe than sorry.
Metadata? Well, you got to have it or you're not going to get very far when it comes to access. Many librarians have made perfect the enemy of the good when it comes to metadata and have resisted any attempt at digitization because we don't have the very best metadata we might have. I'm more in the camp of scan what you have and get as much of the metadata as you can into it. For example, we have 3,200 1000-page volumes of briefs from the 9th Circuit of the U.S. Court of Appeals. We didn't have good metadata, but we had the Internet Archive scan them anyway. Then, after we got our PDF files, I shipped those off to a double-key team in India and they broke the briefs up into individual documents and typed the metadata into a spreadsheet for me, which we hope to release soon.
My point is that sometimes you can shoehorn the metadata in after the fact or you can use a variety of techniques to pull the metadata out of the documents (e.g., smart OCR). In theory, you can use crowdsourcing to get the metadata, but so far I've not had a lot of luck persuading thousands of people to spend their time doing that kind of work. A captcha is a quick thing to do and is between you and something you want, whereas entering metadata in for videos or documents is one of those civic duty things that everybody thinks everybody else should be doing.
Total size? Brewster says a book is about 400 Mbytes (though he's very quick to point out that you could put the words in all the books in the library into a terabyte and if you're distributing PDFs, you can easily throw 130,000 full-color, searchable PDFs onto a 4 TB drive). But, you were probably asking about raw data. Here's some raw numbers:
32 million books at 400 Mbytes each is 12.8 petabytes 50 million photos at 150 Mbytes each is 7.5 petabytes 10 billion pieces of paper ("records") at 100 Kbytes each is 1 petabyte 20 years of video at 8 mbps is only 630 Tbytes.
(Somebody check my math?)
If you're talking a decade-long federal digitization initiative, we're looking at well south of 50 petabytes, which seems pretty doable in this day and age!
Can the rare books collections be digitized?
by autophile
Three closely related questions about the rare books collections at the Library of Congress:
1. I know there is some kind of effort going on to digitize the rare books collections, but can it be sped up? There are many high-quality low-cost archival book scanners out there (such as the ones developed at diybookscanner.org).
2. It gets really annoying to have to receive paper copies of books when copies are requested. Why not DVDs of high-quality images?
3. Why is there no outreach by the LoC to smaller, cheaper book scanning efforts? The Internet Archive, DIYBookscanner.org, and Decapod all come to mind.
CM: In reverse order. I don't know why we aren't distributing and decentralizing our scanning efforts. The Internet Archive is a heavy-duty production shop and they do an amazing job, as do folks like Google Books and the folks digitizing things the Mormon Church. But, there are a bunch of DIY solutions and it would be really nice if we could get more people pitching in. The biggest problem on distributing the digitization efforts is quality control. I know when it comes to ripping video, I can easily teach other people how to grab an MPEG2 off a DVD, but when it comes to things like digitizing a Betacam, that takes some training. But, we're all trainable and I wish we could all do more.
Getting back paper copies of books and papers when they're doing a copy anyway is just plain dumb. Likewise with things like FOIA results. John Podesta testified before the Senate about FOIA and said if an agency answers a FOIA request, they should also post their result online so others can see it. That seems pretty obvious.
As far as digitizing rare book collections, there are some amazing pockets throughout the government but there is no real coordination and there certainly is no effort to scan at scale or to come up with a realistic national digitization strategy. That is why we called on the White House to lead the effort. Within the Library of Congress there are some amazing collections, but if you look around to places like the National Agricultural Library or the National Library of Medicine or the libraries in the service academies you'll find lots more. Some have argued that digitizing rare books is silly because the audience is just a few academics, but I can tell you from my own experience helping host the network site for the Archimedes Palimpsest that when you make this kind of information available, there is an amazing long tail.
If you scan it, they will come. And, to answer your question, if we all scan it, they will come much sooner.
Real time legislation drafting
by kerskine
Would it be possible to implement a system that would allow real-time and continuous review of legislation while it's being drafted? Much has been made over the past three years about legislation being available for review before voting by the House or Senate. The final draft for review usually is huge PDF that makes it near impossible for citizens, interest groups, and the media to thoroughly analysis in time.
CM: You want to see the sausage being made not just buy the hot dog! I'll comment on the U.S. Congress since that's the system I know best. Thomas is a pretty good system if you happen to be stuck in 1994. It does have all the amendments and the actions and the various stages that legislation go through. But, it isn't real time, more like "pretty quick." As Van Jacobson once quipped, "Same day service in a nanosecond world." And, Thomas isn't really machine processable, it is final form, usually formatted ASCII text (shades of NROFF!). People like Josh Tauberer who built GovTrack.US have spent considerable time crawling those systems and trying to get the data into regularized formats and make it available to others to reuse via APIs, but that isn't the same as exposing the inner working of the sausage factory.
Majority Leader Cantor's staff has been pushing a system to make the raw data all available in XML from the Clerk's office and I think that is a very promising initiative which hopefully will bear fruit. (They're having a February 2 conference to discuss their plans if you are interested. I have no idea if it will be streamed for those of who aren't Inside the Beltway and I don't know their schedule for moving past conferences and into production.)
Congress is a pretty complicated beast. I know some folks like Sean McGrath have had better luck with some of the state legislatures. The problem is you need to dig deep into the inner working of a legislature. In the Congress, that means you're changing things like authoring tools that are used in the Clerk's office and by all the staff members, so you have to be careful or you get a bunch of really angry Congressman yelling at you because their staff can't crank out the flavor-of-the-week in the form of a bill or amendment.
There's also a bit of an issue of will. My work with the Congress to put hearings on-line showed that you could take the official transcripts of a hearing and use those to generate closed captions on the video. All you need is the official transcript of the hearing, but in order to get those I had to execute a special Memorandum of Understanding with the House Oversight Committee. Other committees guard their transcripts jealously and won't let them out for several when. When I started processing a bunch of historical videos we purchased from C-SPAN, I went to the Government Printing Office and found that many committees never deliver their transcripts, even a decade after the fact!
How to keep track of legislative activity about open access?
by oneiros27
Recently in the federal register, there were two calls for comments about access to data and research from federally funded research:
http://federalregister.gov/a/2011-28623 [federalregister.gov] http://federalregister.gov/a/2011-28621 [federalregister.gov]
I didn't hear about these until ~4 weeks after the original announcement, and with the holidays, it was too late to try to get the societies I'm involved with to prepare and vote on official statements. Are there any places where people can get/post notices of these sorts of things so that we can stay informed and try to help influence policies?
CM: The Federal Register is getting a lot better now that it is a much more open system. The idea of "Federal Register 2.0" was a paper I wrote for the Obama transition, so it is an issue I've tracked pretty closely and frankly, I've been amazed at how much better it is now. What they did is instead of selling the raw data feed for the Federal Register for $17,000/year, they went from SGML to XML and then released the data in bulk for free. A few guys out in San Francisco were looking for something to do to enter a contest and they took that bulk data and dreamed up GovPulse.US. That was such a better version of the Federal Register that the Office of the Federal Register switched the official site over to their open source platform. My point is the tools are there to do better notification mechanisms, and I'm sure the government would welcome somebody grabbing the GovPulse.US code out of Github and making it even better.
That's the technical answer. But, the substantive answer is that there is a huge boatload of stuff in the Federal Register and it is pretty hard to figure out what to pay attention to. I also missed that particular call for comment, and I've even missed several Requests for Information coming out of places I try and pay attention to, like the White House's Office of Science and Technology Policy. And, I do this stuff full-time! Perhaps better targeted notification mechanisms are the answer. Maybe it is a social media solution, where you pay attention to things your friends are paying attention to. I hope the answer is not that the only way to pay attention is to be employed with a beltway bandit which can afford hundreds of minions that do nothing but pay attention to Washington. Indeed, there are some very fancy for-pay services from folks like Congressional Quarterly and Bloomberg that cost an arm and a leg, but I can't help but think there has to be a better way that is also open.
What do you think of corporate partnerships?
by mhh5
I'd like to know what you think about corporate partnerships in the process to get public data released. (I'm not sure if Google Patents existed before the USPTO released its databases.) Do corporations that get involved in the process tend to make the process better without question, or are there tradeoffs in some areas because the corporations always want to help but then try to retain a proprietary version of the data for themselves?
CM: The theory is that the government gets some kind of valuable service (like digitization) that the government wouldn't get otherwise so it is a "win-win." But, the reality is all too often the government gets snookered and what we do is give some corporation exclusive access to some pot of data and the government doesn't get much of anything. The deal between Amazon and the National Archives was a good example of that kind of a private fence around the public domain. With a help from Boing Boing, I started systematically purchasing those public domain videos and re-releasing them in the wild. I have no problem with Amazon selling public domain video, I just hate it when they get a de facto or a contractual exclusive. (My testimony before Congress on this subject is here.)
There are lots of other examples of government getting snookered. For example, the Government Accountability Office let Thomson West get access to 60 million or so pages of federal legislative histories. At great cost to the government, they were all packed up and dispatched to West which digitized them all and then sent them back to the government. West now sells access to his amazing database. What did the government get for it's trouble? A few logins for GAO staffers. Even members of Congress need to pay to access the database! (We have an interesting paper trail on this issue.)
I'm glad you brought up the Google Patent system because I was personally involved in making that happen and I can tell you that this one is totally legit. Jon Orwant is the lead developer on this for Google and I played a small part in helping convince the White House and the Patent Office they ought to give Jon access to their data (the heavy lifting on that deal was by Beth Noveck who was the Deputy CTO at the time). Google makes all the data they got from the Patent Office available for bulk access with no strings attached. I can vouch for that because I did a mirror of their system. Last I heard Google was sending out anywhere from 1 to 10 terabytes of data PER DAY to external sources and even normally very critical folks who work in this arena have been really happy.
The big problem in the Patent Office is their computing infrastructure is a real catastrophe. Their power plant is over 95% capacity (e.g., plug in a computer, bring the building down!) and even though the Under Secretary knew that selling DVD subscriptions was silly, he wasn't able to switch over to an FTP service. He cut the deal with Google Patent and it worked out well for the government, for Google, and for everybody else.
What's the difference between the Google deal and the Amazon deal? In the case of the Amazon and GAO/West deals, the government lawyers did all the negotiating and they were totally outsmarted by some sharks in industry. But, when government has people like Under Secretary Kappos and Beth Noveck doing the negotiating, these things can work out just fine. The key is government should partner with people who want to do public service, not people who want to service the public.
Encouraging Governments?
by theNAM666
In a city such as Nashville, things as basic as business ownership and property records are not available online. In states such as New Jersey, public records such as basic corporate filings (officers, operating address/address for service of process) are accessible only for a fee.
What concrete actions can citizens confronting such situations, take to encourage accessibility and accountability?
CM: I find you need a carrot and a stick to make this stuff happen, especially at the local level. Folks like Everyblock.Com and CodeForAmerica.Org have done great working prying some of these databases loose, but there is still lots to do.
The first thing you should do is pick up the phone (or pick up your email client) and write/call the people who run the system. Ask them if you can have access to the data. Sometimes, it is as simple as that.
Other times, though, it isn't quite as simple since they want the money (or they want the control or they think this should be done by "private industry" by which they mean some buddy who is a contractor). The nice thing about any government system is somebody usually has oversight responsibilities. So, the next step is to find a city council member of state legislator who has oversight on the agency in question and ask them.
Again, life isn't usually that simple, but sometimes you win! If you can't get anywhere that way, what I usually end up doing is basically competing with the government system. Build a proxy system like RECAPtheLaw.Org did to recycle paid documents. Or, get a sponsor and buy a reasonable number of docs and build a web site that looks like it is going to be a real production system.
Then, go back again and ask. Maybe if you have eyeballs or at least have a nice web site, that is enough to get the government moving. But, if that doesn't do the job, you may have no choice but to compete with them for real, which of course requires a big commitment in time and energy and not everybody can do that. I know in the case of the Patent Office, I started pestering them in 1993, including several times when I spent 6-figure sums purchasing their data, and it still took until 2011 to crack that nut.
The real trick is focus/obsession. Pick one thing you really care about and just keep pestering them until you crack it open. If you're surfing from one opengov problem to another, showing up for a 1-day hackathon then moving on to something else, you're not going to get anywhere. Pick something real and make it your thing.
Privately Owned, Copyrighted Law
by AdamnSelene
I think I have read that the law itself cannot be copyrighted and it should be possible to make it available available to everyone. But as a techie who drafts standards and specifications, I was wondering about how far this goes--especially since Congress recently proposed enacting some of our standards into law. (They decided not to, but they read some parts into the committee records as they debated.) Can you still accomplish your project if a governmental body adopts (or considers adopting) a privately owned, copyrighted technical reference manual or set of safety standards as administrative law (or regulations that carry the force of law)? Or would such obstacles keep you from being able to digitize all of the government's laws (and archives of proposed laws)?
CM: The idea that the law has no copyright is a fundamental part of the American system of government. That applies to states and municipalities as well. The basic decision is Wheaton v. Peters from 1834 but that decision has been reaffirmed over and over. The law is sacred in the American system. You can't have equal protection under the law or due process under the law if there is a poll tax on access to justice.
When we get to a privately developed standards however, it turns into a very interesting issue. The basic mechanism is called Incorporation by Reference. The government will take some external document (such as a model building code) and incorporate the entire text to make it the law of the land. A guy named Peter Veeck was responsible for a landmark decision in 2002 when he published the Texas Building Code which was an incorporation of a privately-developed and very expensive model code. The court ruled that while the model code had copyright, the law of the land did not.
Based on the Veeck decision, my group went and posted many of the public safety codes enacted by the states. We started by purchasing model codes, finding the incorporating legislation, and concatenating the two pieces together and posting the resulting PDFs. More recently, we've done some extensive reworking of the California public safety codes, known as Title 24, converting the entire text into valid XHTML, recoding the graphics as SVG graphics, the formulas as MathML, and regenerating the PDF documents as nicely typeset documents instead of low-quality scans. You can see this work on the web but it is also available as Google Code project.
The federal government also uses this mechanism intensively, with over 2,000 standards incorporated into the Code of Federal Regulations. This is non-trivial stuff, things like all the OSHA safety regulations. The issue was recently considered by a federal group called the Administrative Conference of the U.S. which basically rolled over and endorsed the idea that it is ok for important parts of the law to cost money. (Read EFF's protest letter if you want a good critique of what they did.)
I'm not necessarily saying that government should be able to appropriate any privately-developed standard and make it available. And, I'm not necessarily saying you want OSHA bureaucrats drafting the standards. But, I do think the big standards establishment and the government regulators have cut a deal that results in the law not being available and the costs forked off on private citizens and small business with extortionate monopoly prices. I just paid $847 for a 48-page safety standard from Underwriters Labs and $60 for 2-page safety standard from the Society of Automotive Engineers, both of which are mandated by law in the CFR. They do need money to run their operations, but let me just point out that in 2009 the 501(c)(3) nonprofit Underwriters Labs paid their CEO $2,138,984 and the nonprofit SAE paid their CEO $412,578.
Ancestry.com
by An Anonymous Reader
What is your opinion about websites like Ancestry.com which make use of public records and charge a subscription fee for access? What is the incentive for the government to migrate old documents into digital form when services like these exist? Do you think Ancestry.com should be a 501(c)(3)?
CM: I'm not a big fan of for-profit corporations that have a business model of monetizing the public domain. I'm fine if they exist and fine if they make billions of dollars, but if they are the only game in town they've taken something that belongs to all of us and and turned it into their private property.
The government got snookered on the Ancestry.Com deal. They could have insisted that the raw data be available in bulk for anybody else to use. The folks that approach the government to cut these sweetheart deals argue that is unreasonable because they need a "return on investment" and the argue that if they don't get the return on investment they won't do the deal (and by extension nobody else will do the deal).
But, government can argue much harder! For example, instead of negotiating some exclusive thing with Ancestry.Com, how come they didn't ask the Internet Archive to grab the data? Or put together something creative with a couple of foundations that would pay for the digitization in return for the kind of payback the foundations like to see (e.g., good press, photo opportunity with the President, or other tools of the trade)?
You asked if Ancestry.Com should be a 501(c)(3)? Not all nonprofits do something that I think which should be an essential part of their mission, which is allow others to compete with them. I believe providing open access to all data ought to be a precondition to getting nonprofit status (an idea that Gil Elbaz has been pushing for quite some time). A good example of a nonprofit that builds walls is Guidestar which wants to be the place where you go for all your nonprofit information. The IRS should be making all Form 990 returns of nonprofits available in bulk for anybody to use, which would knock the bottom out of Guidestar's attempts to build walls and force them to stay innovative and provide value.
Pacer Problems
by onyxruby
How much difficulty do you anticipate in getting and publishing records in Pacer? If there's one system that should be free it the decisions that our courts make and yet you are charged by the page just to view the results. Are you concerned about a court taking an unkind view on your archiving what is in Pacer?
CM: PACER is an abomination. Do they take a dim view of our efforts? Well, the Administrative Office of the U.S. Courts reacted so strongly to our efforts to make their data available that they called the FBI on Aaron Swartz and cancelled the only meaningful public access system they had, which consisted of one terminal in each of 17 public libraries around the country. In this era of rapidly decreasing costs, they just boosted their access charges from 8 cents a page to 10 cents a page, arguing that this is a bargain compared to 25 cents a page for a copy machine.
What I find so disturbing about PACER is that when we did get 20 million pages of docs, we were able to conduct a comprehensive analysis of privacy violations in the courts, an analysis that led to a nice thank-you letter from the Judicial Conference and changes in their privacy rules. In other words, only when public interest groups got access to the data did we begin to address privacy issues. Public access is not just about pro se prisoners defending themselves from a jail cell, which is the view of many in the Administrative Office of the Courts. Public access is about attempts like ours (and many other folks) to make our system of justice function better. When we say we are "an empire of laws not a nation of men" that means we write down what we are doing in our courts so that it is no longer the arbitrary decisions of individuals. The paper trail is there so we can make sure the system is functioning properly. When you limit that access to those that only have a Gold Card, you pervert democracy and you pervert justice.
This principle that access to justice shouldn't hide behind a cash register goes back to the Greeks. Theseus in Euripedes' Suppliants said "when there are no public laws, one man holds power by keeping the law all for himself, and there is no more equality. But when the laws are written, the weak man and the rich man have equal justice." The PACER system is justice for the rich man.
Steve Schultze and the team at Princeton did a lot of the heavy lifting on this issue, including the very nice RECAPtheLaw.Org system they built. They've also done a lot of financial analysis that shows that the courts are not only recovering their costs for operating the expensive PACER system, they're making a huge profit (to the tune of $100 million/year) and using their excess profits to do things like buy big-screen TVs in direct violation of the E-Government act.
The basic problem on PACER is the Judicial Conference has delegated the issue to a few techie judges who think what they've built is something great. But, PACER is a hairball of bad PERL code and the result has not served the judges, the bar, or the American people very well. My only hope is that eventually, the Judicial Conference will see that their information technology is 30 years behind the rest of the Internet and feel ashamed at the travesty they have wrought. Until then, we have RECAP.
If you're interested in the issue, a couple of resources to look at are the PACER paper trail and a bit of a rant that I delivered at the Gov 2.0 summit.
How to visualize opened data?
by hardwarejunkie9
The amount of information you're trying to free is entirely staggering and consists, largely, of tables of numbers. These numbers are incredibly significant, but people generally can't see them.
After you free all of this information and make it available to the public (as it should be), then what? What do you expect for the public to do with these numbers? Tables of information are not nearly as useful as graphs. This data needs to be seen, but, more importantly, it needs to be understood.
Do you have any ideas for how to disseminate this information? Perhaps a team-up with someone like gapminder.org's Hans Rosling might be particularly valuable for all of us.
CM: Actually, most of the data I'm looking at is not tables of numbers, it is video, images, textual documents, technical papers, maps, and books.
But, I definitely get what you're saying and there are a lot of numbers. For example, the IRS Form 990s should be structured data instead of PDF documents, so extracting the data from the mass of paper is the initial challenge. There are lots of other examples of this kind of initial extraction, getting what were printed paper docs into structured data. There are some interesting tools, such as OCRopus which does layout analysis, but there needs to be much more. One of the reason we called for a Federal Scanning Commission is that we think there is a lot of directed R&D that could not only scale up mass digitization but could also work on the important value-added of extraction of structured data and handling some of the tricky issues like detecting the presence of Social Security Numbers.
Once you have the data, as you say, then what? I'm a big fan of the idea that the government starts by providing bulk data, then they provide an API, and then maybe they also build web sites and apps and other things along with everybody else out there. That's a 3-part hierarchy that Ed Felten and some of his students developed and it should be a law that applies to all government information systems that are externally facing.
The issue here is that all too often people look at a problem like "digitize all government information" and they want to see the whole stack of the solution from one place. But, I think you can do a layered approach and count on the fact that there is always somebody smarter out there and our job is to reduce the barriers to entry. So, how would I visualize the data? I have no idea, but I'd make damned sure that folks like Martin Wattenberg at Many Eyes and Hans Rosling at Gapminder knew the data was out there and then I'd sit back and be amazed at whatever they come up with. How's that for pushing the problem downstream?
Why is data access so hard?
by CanHasDIY
Can you provide any explanation as to why it is so difficult and cost-prohibitive to obtain records from the government, especially considering the abundance of laws requiring government compliance with requests for information (AKA "Sunshine Laws")?
Is it simply a matter of government employee ineptitude, or have you found evidence of a more nefarious rationale?
CM: I get that question a lot. Why would a member of Congress take deliberate steps to stop public hearings from being available? Why would a court administrator deliberately restrict access to public court documents? Usually the answer is, as Heinlein said, "you have attributed conditions to villainy that simply result from stupidity." When I'm explaining why something is so broken on a big government system, my usual answer is that there are a lot of people still stuck in the 1970s and 1980s, when information dissemination was really, really hard and it took men in white lab coats and computers the size of freight trains to process data. In other words, the problem with a lot of folks who are government gatekeepers is they just don't get the Internet and they don't get computers. In fact, usually when some senior bureaucrat is throwing stones at me, you can find younger staffers working for them rolling their eyes.
That's an optimistic view, and if I'm right things will get better. But, I'm often wrong on my predictions of the future. (I was the guy who saw TimBL demo the web in 1992 and thought to myself "interesting, but it won't scale.")
But, there is also some more nefarious stuff happening, often the accumulation of power by being able to cut exclusive deals with contractor buddies. If your life in government consists of receiving emissaries from Lockheed Martin, maybe you think you're making everybody happy by letting them build you a $1 billion computer system. Often, you think your problems are so unique that the $1 billion solution is the only answer.
And, in some cases, as we've seen from numerous GAO reports, Inspector General reports, Congressional hearings, and newspaper articles, there are some really evil people out there who think the public domain and the government is their personal business opportunity. Looting the federal government is the kind of civic crime that ranks right up there in my book with stealing cookies from Girl Scouts and selling fake medicines to sick people.
Who is the worst?
by TheBrez
Which government agency is the worst to get information from?
CM: I don't know who the worst are (there's a lot of competition for that slot), but the ones that piss me off the most are the ones that should know better.
Public.Resource.Org is a really small operation. I'm the only staff member. My part-time sysadmin is @mdkail who is pretty busy with his day job as CIO at NetFlix. My ISP is Jim Martin and his team at ISC who are kind of busy running the F-Root. My office net is supported by the amazing systems team at O'Reilly which rents me office space at below-market rates.
I'll grant you government would have a tough time getting that kind of help. But, I'm a one-man shop and we run the 4th most popular U.S. government video channel on YouTube, we're the source for a lot of the on-line presence of the U.S. Court of Appeals, and we've supported efforts for the U.S. Congress, the White House, and the National Archives. If we can do this out of Northern California, couldn't the vast resources of the federal government in Washington, D.C. do a whole lot better than they're doing now?
For me, my current bete noir is the U.S. Congress. We got half-way through processing their archives of video from congressional hearings, publishing about 31 terabytes of data. Then, a couple of staffers decided this was a bad idea and pulled the rug out from under us. They actually decided it was a bad idea to publish video from public congressional hearings.
Like any agency, Congress is a mixed bag. We had tons of support from Darrell Issa, for example, and ran a very successful pilot project for him for a year. We talked to all sorts of people on committees and in the various agencies that support the Congress. But, at the end of the day, a couple of staff members were able to decide that the public archive shouldn't be public and they terminated our project. (If you have some time, you might like to read our rather surreal paper trail.)
So, rather than the worst, I think we need to look for the most shameful, the ones that have the privilege and the power and could easily do better. I know it is in vogue to throw stones at government in general and Washington in particular, but there are times when government can be so useful and so awe inspiring it takes your breath away. Government can be that shining city on the hill but we all have to take an active part in our government to keep those lights shining bright. -
IBM Snags Patent On Half-Day Off of Work Notifications
theodp writes "The USPTO appears to have lowered the bar on obviousness, awarding a patent to IBM Tuesday for its System for Portion of a Day Out of Office Notification. 'Out of office features in existing applications such as Lotus Notes, IBM Workplace, and Microsoft Outlook all implement a way to take a number of days off from one day to many days,' acknowledges purported patent reformer Big Blue. 'Yet, none of these applications contain the feature of letting a person take a half-day or in more general terms, x days and x hours off.' Eureka! And yes, the invention is every bit as obvious as you can imagine." -
IBM Granted Your-Paychecks-Are-What-You-Eat Patent
theodp writes "On IBM's Smarter Planet, at least as envisioned in Big Blue's recently-granted patent for 'providing consumers with incentives for healthy eating habits', the FDA will team up with employers and insurers to determine your final paycheck based upon what you eat. IBM explains that whether a given food item is considered healthy may vary based on a number of factors, including 'individual health histories, family health histories, food intake, exercise routines, medications, and other health related factors', and may even be time dependent ('incentives are greater for consumption of a particular food item during a designated lunch time and less for consumption of the particular food item during other periods of time'). Before being issued, IBM's patent request languished for ten years and was only granted after a Patent Examiner's rejection was overturned on appeal. IBM CEO Sam Palmisano has been a cheerleader for pay-for-monitored-healthy-eating on a national level, which seems to be neatly aligned with the goals of his fellow CEOs on the Business Rountable, who told President Obama in 2009, 'It's very important that we don't have a government [healthcare] plan competing with a private plan and finding out that our employees or the citizens in general could go to a plan that doesn't have the same incentives and requirements and behavioral characteristics to make sure that they do the right things long term'." -
IBM Releases Open Source EGL Development Tools
New submitter dd1968 writes "Today IBM announced the release of a new set of Open Source development tools based on their EGL programming language. The announcement describes the tools as being built from the ground up on an 'open, extensible compiler and generator framework.' The one-language approach places an abstraction layer between the developer and target languages, frameworks, and runtime platforms." -
Windows OS Coming To the Mainframe
msmoriarty writes "Following up on its May announcement, IBM has now confirmed that by December 16 it will support Microsoft Windows on zEnterprise via its zBX component." -
Windows OS Coming To the Mainframe
msmoriarty writes "Following up on its May announcement, IBM has now confirmed that by December 16 it will support Microsoft Windows on zEnterprise via its zBX component." -
Is That an Android On Your Wrist?
DeviceGuru writes "Two startups are about to go chrono y chrono with competing Android gizmos. The I'm Watch exclusively targets smartwatch applications, whereas the WIMM Platform is meant to create 'a new market of connected wearable devices that deliver timely, relevant information at a glance' — of which smartwatches are but one example. The Italian-designed I'm Watch runs a customized Android 1.6 on a 454 MHz ARM9 processor with just 64MB of RAM; the WIMM module, a product of Silicon Valley, runs Android 2.1 on a 667 MHz ARM11 CPU. Would you actually wear one of these things?" Personally, I'd rather have an IBM watch running Linux. -
Accent Monitoring: Innovation Or Rights Violation?
theodp writes "After almost a decade of sending monitors to classrooms across the state to check on teachers' articulation, the NY Times' Marc Lacey reports that a federal investigation of possible civil rights violations has prompted Arizona to call off its accent police. The teachers who were found to have strong accents were not fired, but their school districts were required to work with them to improve their speech. Interestingly, one person's civil rights violation is another's 'wonderful little phenomenon', which is how PBS described the accent neutralization classes attended by Bangalore call center workers who worked for the likes of IBM and Microsoft. On its website, IBM Daksh notes that 'To make sure that customers all over the world can understand the way our people speak, every new hire is trained in what we call voice and accent neutralization.' So, is accent monitoring and neutralization a civil right violation, as the U.S. Depts. of Justice and Education suggest, or is it an 'innovation', as IBM argues?" -
IBM Seeks Patent On Retailer-Rigged Driving Routes
theodp writes "On IBM's Smarter Planet, you may drive further than need be to get to your destination. Big Blue's pending patent for Determining Travel Routes by Using Fee-Based Location Preferences calls for the likes of Walmart, Starbucks, and Best Buy pay a fee in return for having your route calculation service de-optimize driving instructions to make you do a drive-by of their stores, and an additional fee if GPS tracking of your car indicates you actually took the suboptimal route. The same IBM inventors also have a patent pending for Environmental Stewardship Based on Driving Behavior, which calls for yet another fee to be assessed when a retailer-friendly-but-suboptimal route causes your vehicle to enter a congested area and produce more pollution." -
IBM Chief: All CEOs Reluctant To Invest In R&D
theodp writes "In his Centennial Conversation at the Computer History Museum, IBM CEO Sam Palmisano emphasized the importance of investing in R&D, even in a down economy. 'Shareholder expectations for higher returns don't diminish when the economy stutters,' said Sam. 'And yet, Tom Watson Sr. actually increased research investment during the Great Depression.' Palmisano added, 'I will tell you that my own instinctive reflex isn't to continue investing $6 billion a year during the worst economic downturn since the Great Depression. In that regard, I'm like all CEOs.'" -
Jeopardy-Playing Supercomputer Beats Humans
An anonymous reader writes "Ok, this was just a practice round. But in a short demonstration today IBM's Jeopardy-playing supercomputer, a whiz by the name of Watson, thoroughly bested two talented human contestants. IBM has been working on this artificial intelligence project for years to prove that a computer can be programmed to understand conversational speech and wordplay. In today's demo, Watson seems to have proved the point: it started out on a roll in the category 'Chicks Dig Me,' about women and archaeology. The real man versus machine face-off (in which the same contestants compete for a $1 million prize) will be taped tomorrow, and aired in February." -
Judging You By the Online Company You Keep
theodp writes "Network analysis uses data about your social network interactions to make assumptions and predictions about your behavior. The Economist notes the upside for companies looking to sell products. But don't forget about the downside, warns Adrian Chen, of living in a world where network analysis is used by financial firms to determine risky borrowers by looking at social ties, or by Internet businesses to determine which customers are more equal than others (nice to see Microsoft's back on the forefront of some tech!). So, did Mom envision Social Network Analytics when she gave you that you-are-the-company-you-keep lecture?" -
IBM's Question-Answering System "Watson" Revisited
religious freak writes "IBM has created and made the question answering algorithm, Watson, available online. Watson has competed in and won a majority of (mock) matches against humans in Jeopardy. Watson does not connect to the Internet to answer his questions, but rather seeks answers using many different algorithms then employs a ranking algorithm to choose the best answer." We mentioned Watson last year as well. -
New Linux Petabyte-Scale Distributed File System
An anonymous reader writes "A recent addition to Linux's impressive selection of file systems is Ceph, a distributed file system that incorporates replication and fault tolerance while maintaining POSIX compatibility. Explore the architecture of Ceph and learn how it provides fault tolerance and simplifies the management of massive amounts of data." -
Anatomy of Linux Kernel Shared Memory
An anonymous reader sends in an IBM DeveloperWorks backgrounder on Kernel Shared Memory in the 2.6.32 Linux kernel. KSM allows the hypervisor to increase the number of concurrent virtual machines by consolidating identical memory pages. The article covers the ideas behind KSM (such as storage de-duplication), its implementation, and how you manage it. -
IBM Stops Disclosing US Headcount Data
theodp writes "ComputerWorld reports that IBM has stopped providing breakouts on US employees, closing a door to data that provided insights into the bellwether company's employment shift. In its latest Annual Report, Big Blue only provides its global headcount, and an IBM spokesman confirmed that disclosure of US headcount is a thing of the past. The Rochester Institute of Technology's Ron Hira called the US workforce data critical for policymakers trying to understand the dynamics of offshoring. 'By hiding its offshoring, IBM is doing a disservice to America — through omission the company is providing misleading labor market signals and information to policy makers,' Hira said. Ironically, CEO Sam Palmisano's Letter to Shareholders, which accompanied the Annual Report, touts how IBM's Analytics and 'Smarter Planet' efforts are empowering US government decision-makers. Nondisclosure domestically and abroad seems to be the new rule of thumb for Big Tech, sparking calls for government intervention." IBM laid off about 10,000 US workers last year, and 2,900 so far this year, according to the Alliance@IBM, a labor union. -
IBM Stops Disclosing US Headcount Data
theodp writes "ComputerWorld reports that IBM has stopped providing breakouts on US employees, closing a door to data that provided insights into the bellwether company's employment shift. In its latest Annual Report, Big Blue only provides its global headcount, and an IBM spokesman confirmed that disclosure of US headcount is a thing of the past. The Rochester Institute of Technology's Ron Hira called the US workforce data critical for policymakers trying to understand the dynamics of offshoring. 'By hiding its offshoring, IBM is doing a disservice to America — through omission the company is providing misleading labor market signals and information to policy makers,' Hira said. Ironically, CEO Sam Palmisano's Letter to Shareholders, which accompanied the Annual Report, touts how IBM's Analytics and 'Smarter Planet' efforts are empowering US government decision-makers. Nondisclosure domestically and abroad seems to be the new rule of thumb for Big Tech, sparking calls for government intervention." IBM laid off about 10,000 US workers last year, and 2,900 so far this year, according to the Alliance@IBM, a labor union. -
Arrested IBM Exec Goes MIA On the Web
theodp writes "Among those charged in the largest hedge-fund insider trading case in US history was IBM Sr. VP Robert W. Moffat, the heir apparent to IBM CEO Sam Palmisano and the guy behind Big Blue's 'workforce rebalancing' and the sale of IBM's PC unit to Lenovo. IBM's not talking about the incident, but it's interesting that Moffat's bio is MIA at IBM.com ('Biography you tried to access does not exist.'), and his Smarter Planet video can no longer be found ('This video has been removed by the user.') at IBM's YouTube Channel. Do you need approval from the Feds before tidying up after someone who's under investigation? BTW, if stories and comments appearing in the Times Herald-Record and Poughkeepsie Journal are any indication, Moffat may want to avoid a local jury trial. 'I have talked to a few IBMers today, and there seems to be a lot of cheering in the halls of IBM over his arrest,' said Lee Conrad of Alliance@IBM." -
IBM's Patent To "Capture Expert Knowledge" With Games
theodp writes "Robert X. Cringely offers his take on IBM's patent-pending way to suck knowledge out of experts and inject it into younger, stronger, cheaper employees, possibly even in other countries. IBM's 'Platform for Capturing Knowledge' relies on immersive 3-D gaming environments to transfer expert knowledge held by employees 'aged 50 and older' to 18-25 year-old trainees, even those who find manuals 'difficult to read and understand.' It jibes nicely with an IBM White Paper (PDF) that advises CIOs to deal with Baby Boomers by 'investing in global resources from geographies with a lower average age for IT workers, such as India or China.' While Cringely isn't surprised that Big Blue's anyone-can-manage-anything, anyone-should-be-able-to-perform-any-job culture would spawn such an 'invention,' he can't help but wonder: When you get rid of the real experts, who is going to figure out the new stuff?" -
SKA Telescope To Provide a Billion PCs Worth of Processing
Sharky2009 writes "IBM is researching an exaflop machine with the processing power of about one billion PCs. The machine will be used to help process the Exabyte of data per day expected to flow off the Square Kilometre Array (SKA) telescope project. The company is also researching solid state storage technology called 'racetrack memory' which is much faster and denser than flash and may hold the secret to storing the data from the SKA. The story also says that the SKA is unlikely to use grid computing or a cloud-based approach to processing the telescope data due to challenge in transferring so much data (about one thousand million 1Gb memory sticks each day)." -
IBM, Other Multinationals "Detaching" From the US
theodp writes "If you're brilliant, work really hard, and earn a world-class doctorate from a US university, IBM has a job for you at one of its US research sites — as a 'complementary worker' (as this 1996 piece defined the then-emerging term). But be prepared to ship out to India or China after you've soaked up knowledge for 13 months as a 'long-term supplemental worker.' Newsweek sketches some of the bigger picture, reporting that IBM, HP, Accenture, and others are finding it profitable to detach from the United States (even patenting the process). 'IBM is one of the multinationals that propelled America to the apex of its power, and it is now emblematic of the process of creative destruction pushing America to a new, less dominant, and less comfortable position.'" -
IBM Uses Call-Detail Records To Identify "Friends"
theodp writes "Big Blue may know what you did last summer. Or at least who you called. In a move out of the NSA's playbook, IBM Research has been scrutinizing the call-detail records of 'one of the largest mobile operators in the world' (PDF). By analyzing who calls whom, and for how long, IBM claims its patent-pending snooping software can now identify circles of 'friends' who tend to exhibit the same profit-threatening behavior. 'We believe that our analysis is a first of its kind that exploits the underlying social network in a telecom call graph,' boasted a team of IBM researchers and a UMD prof. For now, IBM seems to have focused on using the info to see if your friends are churners, so you can be dealt with pro-actively lest you follow their lead and bolt. However, IBM suggests its SNAzzy data mining technology (Social Network Analysis for Telecom Business Intelligence) has a bright future, noting it 'is also capable of analyzing any kind of social network or graph, not just telecom networks.'" -
POWER7 To Ship In First Half of 2010
BBCWatcher writes "In CPU news, IBM says that its POWER7 servers will start shipping in the first half of 2010, on schedule or perhaps even a few months early if you believe Wikipedia. Moreover, upgrades from a wide variety of POWER6 models will be mere CPU swaps, with the upgraded servers keeping their same serial numbers. (Bean counters like that.) POWER7 sports up to 8 cores per die, 4 threads per core, a clock speed a Hertz or two above 4 GHz, 45 nm process manufacturing, on-chip DDR3, and up to 1,000 micropartitions per machine. IBM claims that POWER7 will offer about 256 Gflops per die and two to three times the performance per watt as POWER6. IBM wants to keep taking orders now for its POWER6 gear (duh), so its sales reps are allegedly ready and eager to deal on 6-cum-7 packages. And it looks like that cunning plan could work rather well given Sun's Rock CPU cancellation and HP's delay of Tukwila Itanium to 2010. (Is anybody still in the server CPU race except IBM, Intel, and maybe AMD?) In 2006, POWER7 won the contest for a DARPA supercomputing R&D grant of $244 million, so you could say that each US citizen is in for about a dollar already." -
IBM Releases Open Source Machine Learning Compiler
sheepweevil writes "IBM just released Milepost GCC, 'the world's first open source machine learning compiler.' The compiler analyses the software and determines which code optimizations will be most effective during compilation using machine learning techniques. Experiments carried out with the compiler achieved an average 18% performance improvement. The compiler is expected to significantly reduce time-to-market of new software, because lengthy manual optimization can now be carried out by the compiler. A new code tuning website has been launched to coincide with the compiler release. The website features collaborative performance tuning and sharing of interesting optimization cases." -
SolarNetOne Wants Stable Internet Connections For Developing Nations
There are many initiatives to bring tech to developing areas of the globe; things like OLPC, Geekcorps, and UN programs. One new approach from SolarNetOne strives to allow users in those developing areas to have access to an internet connection without having to depend on unreliable infrastructure. "Each SolarNetOne kit is a self-powered communications network. Energy is produced from a solar array sized to each locale's latitude and predominant weather conditions. The generated power is stored in a substantial battery array, and circuit breakers and electronics protect the gear from overloads and other perturbations. A basic kit includes five 'seats,' implemented as thin clients connected through a LAN to a central server. The networking gear also includes a long-range, omnidirectional WiFi access point, and a Session Initiation Protocol (SIP) device. Each kit also includes all the cables and wires required to assemble the system, so few additional materials are required for an installation." -
IBM Pushing Water-Cooled Servers, Meeting Resistance
judgecorp writes "IBM has said that water-cooled servers could become the norm in ten years. The company has lately been promoting wider user of the forty-year-old mainframe technology (e.g., here's a piece from April 2008), which allows faster clock speeds and higher processing power. But IBM now says water cooling is greener and more efficient, because it delivers waste heat in a form that's easier to re-use. They estimate that water can be up to 4,000 times more effective in cooling computer systems than air. However, most new data center designs tend to take the opposite approach, running warmer, and using free-air cooling to expend less energy in the first place. For instance, Dutch engineer Imtech sees no need for water cooling in its new multi-story approach which reduces piping and saves waste." -
An Early Look At What's Coming In PHP V6
IndioMan writes "In this article, learn about the new PHP V6 features in detail. Learn how it is easier to use, more secure, and more suitable for internationalization. New PHP V6 features include improved support for Unicode, clean-up of several functions, improved extensions, engine additions, changes to OO functions, and PHP additions." Update — May 7th at 16:47 GMT by SS: IBM seems to have removed the article linked in the summary. Here's a different yet related article about the future of PHP, but it's a year old. -
An Early Look At What's Coming In PHP V6
IndioMan writes "In this article, learn about the new PHP V6 features in detail. Learn how it is easier to use, more secure, and more suitable for internationalization. New PHP V6 features include improved support for Unicode, clean-up of several functions, improved extensions, engine additions, changes to OO functions, and PHP additions." Update — May 7th at 16:47 GMT by SS: IBM seems to have removed the article linked in the summary. Here's a different yet related article about the future of PHP, but it's a year old. -
IBM's But-I-Only-Got-The-Soup Patent
theodp writes "In an Onion-worthy move, the USPTO has decided that IBM inventors deserve a patent for splitting a restaurant bill. Ending an 8+ year battle with the USPTO, self-anointed patent system savior IBM got a less-than-impressed USPTO Examiner's final rejection overruled in June and snagged US Patent No. 7,457,767 Tuesday for its Pay at the Table System. From the patent: 'Though US Pat. No. 5,933,812 to Meyer, et al. discussed previously provides for an entire table of patrons to pay the total bill using a credit card, including the gratuity, it does not provide an ability for the check to be split among the various patrons, and for those individual patrons to then pay their desired portion of the bill. This deficiency is addressed by the present invention.'"