It doesn't know for sure - but it asks multiple people the same CAPTCHA and if enough people (as a percentage) give the same answer, then it gets 'approved' and stored as the correct answer for the word.
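A rough sketch of that voting idea, for what it's worth. This is not reCAPTCHA's actual code; the function name and the thresholds (`min_votes`, `min_share`) are made up for illustration, and the real service surely does more than this.

```python
from collections import Counter

def consensus_answer(answers, min_votes=3, min_share=0.75):
    """Return the agreed transcription, or None if there is no consensus yet.

    min_votes and min_share are hypothetical thresholds, not known values.
    """
    # Normalize so that case and stray whitespace do not split the vote.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None
    word, count = Counter(normalized).most_common(1)[0]
    # Approve the most common answer only if a big enough share agrees.
    return word if count / len(normalized) >= min_share else None

print(consensus_answer(["upon", "Upon ", "upon", "upan"]))  # 3 of 4 agree
print(consensus_answer(["upon", "upan"]))                   # too few votes
```

The first call approves "upon" because three of four answers match; the second returns None because two answers are not enough to decide anything.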
That sure is a complicated way of generating an image of letters that is difficult to OCR and using it for verification, which is all that is needed. Other CAPTCHAs already do that with letters that are squished together and in various pastel colors. The only problem is that the example in TFA used by Microsoft was trivial: the letters were well spaced apart and easily distinguished from the background.
Thanks for the explanation.
rd
As computers get more powerful and AI gets better, CAPTCHAs have to get harder or they are broken.
Good point. The Windows Live CAPTCHA example is trivial in that the letters are spaced well apart, not even overlapping in the way the crossbar of a T can hang over the bottom of an A. Since I developed code for a personal OCR project twenty years ago that separated letters with any gaps snaking between them, I'm sure that's not a problem for these guys.
What's needed is to generate letters that overlap significantly. That is where a human can differentiate, and these guys' OCR won't pass that Turing test by the time a new email protocol is invented.
I enter CAPTCHAs on a site or two that don't overlap the letters significantly, but shove them together so they overlap slightly. That's all that's needed for now. Just so it's not this easy.
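To show what "gaps snaking between them" buys you, here is a minimal sketch using plain connected-component labeling (a flood fill). It is not the code from my old project; the `split_glyphs` name and the tiny ASCII "image" are just assumptions for the example.

```python
from collections import deque

def split_glyphs(rows):
    """Group ink pixels ('#') into 8-connected blobs, ordered left to right."""
    height, width = len(rows), len(rows[0])
    seen = set()
    blobs = []
    for y in range(height):
        for x in range(width):
            if rows[y][x] != "#" or (x, y) in seen:
                continue
            # Flood-fill one blob starting from this unvisited ink pixel.
            queue = deque([(x, y)])
            seen.add((x, y))
            blob = []
            while queue:
                cx, cy = queue.popleft()
                blob.append((cx, cy))
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and rows[ny][nx] == "#"
                                and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            queue.append((nx, ny))
            blobs.append(blob)
    # Sort by leftmost pixel so blobs come out in reading order.
    blobs.sort(key=lambda b: min(px for px, _ in b))
    return blobs

# A "T" whose crossbar hangs over a small "A": no straight vertical cut
# separates them (column 4 has ink from both), but they never touch.
image = [
    "#####...",
    "..#.....",
    "..#..#..",
    "..#.#.#.",
    "..#.###.",
]
print([len(blob) for blob in split_glyphs(image)])  # prints [9, 6]
```

Once the letters actually touch, the fill merges them into a single blob, which is exactly why significant overlap makes the job so much harder.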
rd
Botnets. If someone really wanted to make 10,000 accounts, just have each computer on a botnet of 10,000 computers make 1 account each. Different IPs, etc., make them difficult to distinguish from legitimate account creations.
They already do this.
rd
As it uses text digitized from old books that the best OCR technology couldn't read, it's continually different and already demonstrated to be unintelligible to machines.
How does the ReCAPTCHA check know whether the right answer was given?
Hrm. I think it more likely we'd sell as much of it as possible to developing countries that are thirsting for it (like China) in order to pay down our massive debts to foreign countries that we're racking up...
A big part of our massive debt is from importing oil. Sell it? Energy independence means hoping we can produce enough without needing to import it. We used to sell it. That was several dry Texas oilfields and millions of SUVs ago.
rd
And then there's the conspiracy theorist in me who wonders if they aren't purposely driving the price of oil up in order to make exploiting domestic oil that much more realistic, and thus wean us off the foreign teat...
I assure you we have no influence on the price of oil, which Bush found out when he went over recently and asked the Saudis to lower the cost of oil, not raise it.
The cost per barrel of oil from sources like shale oil, tar sands, and coal liquefaction is somehow always higher than buying sweet crude; at least, the estimates rise as fast as the OPEC suggested list price. Why? I dunno, but I'm sure it has something to do with the giant oil companies that sell imported crude being the ones making the estimates.
rd
you're joking, right?
rd
To be fair, this was a surprise action by Amazon. Lots of people are hosed. Lots of companies are hosed too and, as you can imagine, even more than you are.
A couple of mitigating factors: one is that any POD publisher can list as a third-party seller on Amazon, though there are cost and margin tradeoffs. I don't know what the PODs are going to do about Amazon.
The other is that there is next to zero chance that a person will ever look at an Amazon web page for a book without being pointed there from elsewhere. That goes for any book, however it was published, without something to make people aware of it. So if you had done your deal a year ago and been on Amazon for a year, there would still be no noticeable difference at this point from never having been there, unless you were able to get potential buyers to your book page.
Nothing against you or POD, just true in general.
rd
Why spend your capital to buy niche products that may sit in the warehouse for months when you can rent website space and let someone else take the risk? This way you can spend your capital to buy up top 40 CDs and bestseller books that will sell more copies.
Amazon didn't carry inventory of POD books; that's why they're called Print On Demand. They get printed when there's a retail purchase.
Major wholesalers do carry some minor POD stock so they can ship overnight, but Amazon just has them do the shipping, out of inventory or printed as needed.
That all changes now with Amazon's ultimatums, and your argument is now reversed. As the only wholesaler/retailer for all POD books sold on Amazon, they actually have to carry POD inventory now to be able to ship overnight, but it's their Booksurge inventory.
They have incurred POD inventory risk, not eliminated it.
rd
Here's the link to CNN's story:
http://money.cnn.com/2008/04/01/technology/bc.na.fin.us.ibm.contract.ap/index.htm?postversion=2008040118
The gist is that IBM protested an $84 million financial modernization project awarded to some other consultant, CGI, and the GAO had still not made a decision, although the research firm Fed Sources earlier reported the GAO upheld IBM's complaint.
The EPA last week temporarily banned IBM from receiving any new government contracts while it investigates "potential activities involving a procurement."
Sounds like the EPA believes IBM was interfering with the GAO decisionmakers to overrule the EPA and award the contract to IBM.
So they apparently decided to play some hardball back.
rd
How can taxpayers benefit from such a move?
What, do you work for IBM? Unbelievable reasoning, even for April Fool's Day.
According to your reasoning, Enron should still be kicking.
rd
They apparently missed their payments, and you know the ones I'm talking about.
I learned some of the most important concepts in a class that was all done in pseudocode...You are destined to be a code monkey.
which is preferable to being a pseudocode monkey like you.
rd
What other online booksellers are out there? Particularly booksellers that deal with POD?
Barnes & Noble, Powell's, Books-A-Million
It's very very hard not to make money off of POD if you sell at least one copy.
It's very very hard to sell at least one copy.
rd
...but the quality can be rather unpredictable.
I would say this about any book from any publisher. That's why I read a sample chapter, if available from the publisher, and buyer reviews on Amazon before I buy.
Now obviously, POD publishers print anything a person pays to have printed, so it could be lunatic ravings as far as that is concerned. But no one will ever see the book unless they're browsing through online book sites with some kind of search that happens to bring it up, and then they purchase based on becoming enamored with the title.
rd
Or they could have their own POD option with which you can legally acquire a copy of the ebook in print form.
Most of the POD publishers have this. The e-book is available for sale as another option. But Amazon already dropped other e-book sales after they bought their own e-book publisher.
Shades of what was to come.
rd
.. there are plenty of book topics that might have a large customer base (large enough to make money anyways) that no publisher would dare buy no matter how good the manuscript...
Yes, as everyone knows from TV nowadays, there is incredible interest in true crime, but publishers will not take on a true crime book about a crime unless and until there is a verdict, with very few exceptions.
Self-publishing POD would be the only way to get your true crime book published.
rd
I suspect Amazon is simply doing this as a way to stay away from business that creates lots of hassles and no significant profit.
It's incredible that this post was modded informative. Amazon bought their own POD publisher and printer, Booksurge, in 2005 and is now in the process of requiring that POD books sold by Amazon be printed by their subsidiary. There is also a change to the percentage requirements on Amazon sales that further benefits Amazon.
In addition, according to TFA, books printed through Booksurge would not be in Ingram distribution (Ingram being the wholesaler to the bookstores), as Lightning Source books are. It is beyond belief that anyone would accept that just to be sold by Amazon. If forced to make a choice like that, it won't be Amazon.
rd
where's the imagine a beowulf cluster of these post?
I think there would only be need for about six of these in the entire world.
rd
Seems like most of the low-hanging fruit is in frameworks for connecting business rules to a database with the usual CRUD operations.
That might be a philosophical goal, but it sure as heck isn't software redundantly reimplemented in a common form at numerous companies, and therefore it isn't infrastructure that everyone wastes time reinventing when they could pool their efforts and reuse it.
I've rarely seen a framework that would perform an I/O operation such as an insert or update of data based on configurable business rules that wasn't a commercial product. It certainly isn't the kind of common home-grown solution in companies that Red Hat is speaking of.
This is what I mean by no one being able to even start a list of these alleged billions of dollars of reinvented software by numerous companies that performs common non-proprietary functionality.
I would contend that anything anyone could come up with has already been started as one or more open source projects. If there were potential for what Red Hat contends, the open source infrastructure already provides the opportunity to do it.
I can guarantee you that a business-rule-driven framework to drive CRUD operations is not remotely in the realm of corporate software development. I have no doubt that several entrepreneur types would like to perfect a system along these lines and offer it as a product, open source or otherwise, but it certainly isn't something which companies have reinvented in house.
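For anyone wondering what "an insert driven by configurable business rules" even means, a toy sketch follows. Everything in it is hypothetical: the `RULES` table, the `orders` schema, and `insert_with_rules` are invented for the example and don't come from any real framework or product.

```python
import sqlite3

# Hypothetical rule configuration: table name -> list of (description, check).
RULES = {
    "orders": [
        ("quantity must be positive", lambda row: row["quantity"] > 0),
        ("discount capped at 30%", lambda row: row["discount"] <= 0.30),
    ],
}

def insert_with_rules(conn, table, row):
    """Insert row only if every rule configured for the table passes.

    Returns the list of failed rule descriptions (empty on success).
    Table and column names are assumed to come from trusted configuration.
    """
    failures = [desc for desc, check in RULES.get(table, []) if not check(row)]
    if failures:
        return failures
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(row.values()),
    )
    return []

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (quantity INTEGER, discount REAL)")
print(insert_with_rules(conn, "orders", {"quantity": 5, "discount": 0.10}))
print(insert_with_rules(conn, "orders", {"quantity": 0, "discount": 0.50}))
```

The first row goes in and returns an empty list; the second is rejected with both rule descriptions. The toy part is easy. Making it general enough to reuse across companies is the part I'm saying nobody has built in house.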
rd
It doesn't cost that much; Harris is simply trying to pull shenanigans and get some free money for a project they already plan on failing. This is the status quo for government projects.
This is totally offbase and clueless. Even a casual reading on this subject would show that the original specs and the number of changes the gov made to the project (in the hundreds) are behind the big problems, but Harris trying to get free money isn't one of them.
rd
What are the factors that turn a simple software project into an impossible task?
not being simple.
rd
Here in the UK we've had several high profile massive budget IT failures in the last 10 years: air traffic control, national health patient record databases. In fact, the more critical it is, the more of a spectacular unqualified fuck-up it becomes.
Same here in the US. I contend it's because new software is web and SQL based whereas former large systems were terminal and record level IO based, but IS refuses to see that the Emperor is wearing no clothes.
rd
Paint me blue and call me stupid, but really, how hard is it to make a hand-held computer designed to take and store census data? It's not like these machines need to calculate pi. It's data entry and retention. Right? How could that possibly require $2 billion to implement? What am I missing? (beyond the obvious corruption and inflation of budgets to line the pockets of fat cats)
You wouldn't believe it if you read the specs for the thing. The gov actually is requiring a custom built PDA because they couldn't find anything on the market to perform to these specs. GPS, mapping, communications of entered data, and who knows what else besides data retention. Yet somehow small notebooks didn't cut it.
I personally think it's dumb as hell, especially when all this is to try to count people that didn't respond to mailings and don't want to be counted/found/documented.
rd