Not sure what you're trying to imply. Open Source intelligence predates open source software by probably 30 years.
Education of the end user IS the ultimate goal, but not education in computer skills. The XO laptop is a learning tool, and is likened by its creators to a "pencil". Their goal is to give each child in the developing world their own "pencil" to create with. No one will be reconfiguring their kernels on these things.
Well, congratulations to you. You can certainly choose not to believe the book, written by writers from the New York Times, with the collaboration of Vint Cerf and Bob Kahn. Clearly you have superior knowledge to those two individuals.
This is an old wives' tale that deserves to die. The ARPANet was NOT built as an experiment in resilient networking; it was built by DARPA to connect scientists so they could share all the large computers that DARPA was funding.
See: Where Wizards Stay Up Late
http://www.amazon.com/Where-Wizards-Stay-Late-Internet/dp/0684832674
and
http://www.businessweek.com/1996/38/b349359.htm
Well, honestly, they BOTH matter (it won't be easy for them to get on the internet without the laptop :-).
Negroponte mentioned using refurb desktop PCs to solve the problem. The issue is the sheer labor in refurbing them, getting them qualified, and shipped. And then there's the problem that many households do not have power, and a desktop PC requires SUBSTANTIALLY more power. You really need the package.
It gets on the web, from day one. The laptops self-configure into a mesh, and they are working on deals with indigenous ISPs to provide free internet access for OLPC laptops. Once you've done that, you have the world at your fingertips.
I was at this demo, and got to use the OLPC. Negroponte related an extremely funny anecdote. He described a conversation with a flatscreen vendor that went something like this:
NN: "We need to buy some small, 640x480 LCD screens 6" across. They can have poor color consistency and even a few bad pixels".
FSV: "I'm sorry, we're focused on 50" screens with 10,000:1 contrast, perfect color consistency, and no bad pixels".
NN: "We need 100,000,000 of them".
FSV: ".......oh......."
As a side note, this unit was *fantastic*. Say what you will about the look -- in person, the thing was a work of art. It had the weight of a paperback book.
I figure since you reposted your message, I probably should as well :-)
It is extremely naive to think that you can solve a problem like this by throwing warm bodies at it. Translator salaries ARE very high (100-150k), and you can verify this with 5 minutes of Google time. And there still aren't enough - there just aren't that many Arabic linguists in the US. There are two knock-on problems:
1. Most of the time the government requires linguists that can be cleared at a very high level. Obviously this isn't an issue for this data, but it certainly raises the bar.
2. Translation is a tough job. Good translators make it their lifelong profession. Unfortunately, at this salary rate what you ARE seeing are people with little background trying to become translators. Now, that's actually a good thing long-term, but the quality of the translations they produce usually isn't as good as you would want. I've seen many foreign nationals or 1st/2nd gen immigrants who left the dotcom bomb and have tried to make this move.
But making claims like "the US government has $2BN and can just solve the problem" is extremely specious -- we haven't solved poverty, cancer, drugs, or homelessness that way either. Translating what amounts to an entire country's government's document cache is a pretty hard problem.
I'm not sure what you're calling BS on. Do you NOT believe that they are releasing the documents?
Good Arabic translators (say rated 3+ on the Interagency Language Roundtable scale), particularly ones able to acquire a top secret clearance, command salaries of ~$150k+. This means they probably cost the government $300k+.
At any rate, more transparency into these documents is unarguably a good thing.
Hands down, Toughbook, CF-73. Buried one in the dirt in Hawaii at a military exercise last summer, with a DVD. Played the DVD fine the next morning, AFTER DRAINING THE WATER OUT OF IT that accumulated from the dew overnight.
...is here:
http://www.ic-arda.org/InfoExploit/aquaint/index.html
They already DO that. When you sign up you have to check a zkillion boxes saying that you acknowledge that you don't have real 911 service.
Of course, then a few people died, lawsuits ensued, and we wound up where we are now.
Why would you expect it to be any different?
In this case, however, I think it's a good thing. VZ and the other incumbents were playing the "oh, it's HAAAARDDD to open our 911 systems" card, which has to be a load of horse shit.
Yea, it's basically like that. Real conferences don't accept unreviewed papers at all, so that's a telltale sign.
Yup, this conference looks like one of those used to buff resumes. If you look at the "Academic and Industry sponsors" page, you will notice that NO major universities or societies are sponsoring this conference. I get a couple of invitations to things like this a month.
I'm not entirely sure what you mean when you say "...aren't really used to measure search technology" when that is exactly what they ARE used for.
Relevance IS a problem of human judgement, but over a community of humans, it is possible to get an average measure of precision and recall. Utility certainly enters into it, and conventional evaluation approaches don't address things like user interaction with the search methodology. But that doesn't make them less valid as a comparative approach.
However, if you feel that Google gives you better results, by all means go for it!
Good point. The IR community has a method to take that into account, I just forget how it's done.
Unfortunately, NONE of the search engines I know of actually show the ranking score -- I think they claim the results are ordered (which is good), but how big is the step between 1 and 2, or 2 and 3? No one knows!
This is why you have to generate a DET curve by measuring P and R at several points.
But the point remains: in most practical applications of open-ended search, precision is the only game in town. Given the two cases you describe, the 100% precision result is certainly better!
So you are saying that there is no yardstick by which you can compare search engines? I don't believe that to be true. It's just hard.
If there isn't, how are you going to answer your question #1 -- gut feeling? By missing out - do you mean parts of the web? usability features?
As I said in an earlier post -- it's nearly impossible. But that doesn't mean you can't come up with a reusable metric to make an objective judgement.
Just take the first 1,000 that come out of the search engine when you give a query!
Unless you know for sure all the places that your somewhat rare name is mentioned, and that the search engine you're working with has visited the site, this won't work.
Unfortunately, comparing search engines is a nearly impossible task, since they probably aren't indexing the same data.
When you measure a search technology, the values you typically look for are precision and recall. Precision says "of the X results you gave me, how many of them are relevant". Recall says "in the world, there were Y relevant pages you could have found, and you gave me X of them".
You can't measure recall for a public search engine, but you can measure precision. Take a set of sample queries, and some users. Have them perform the queries, and go through the first ~100 pages and give them a "thumbs up" (relevant) or "thumbs down" (not relevant).
Your overall score will measure precision: if at N=100, all 100 were relevant, that's 1.0. If only 50 were judged relevant, precision is 0.5.
You can estimate recall by judging say 1,000 documents (phew). Then sample precision at N=10, 100, 500, etc, assuming that is an "exhaustive" list of documents in the world.
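For anyone who wants to try this at home, here is a rough sketch of the bookkeeping in Python. Everything in it is an assumption made for illustration: the function names `precision_at` and `pooled_recall_at` are made up, the judgments are invented rather than real data, and "recall" here is only recall against the pool of documents you actually judged, not against the whole web.

```python
from typing import Sequence


def precision_at(judgments: Sequence[bool], n: int) -> float:
    """Fraction of the first n ranked results that were judged relevant."""
    top = judgments[:n]
    if not top:
        return 0.0
    return sum(top) / len(top)


def pooled_recall_at(judgments: Sequence[bool], n: int) -> float:
    """Relevant results in the first n, divided by all relevant results
    in the judged pool (which we pretend is an exhaustive list)."""
    total_relevant = sum(judgments)
    if total_relevant == 0:
        return 0.0
    return sum(judgments[:n]) / total_relevant


# Hypothetical thumbs up/down for one query, in ranked order.
judged = [True, True, False, True, False, False, True, False, False, False]

for n in (1, 5, 10):
    print(n, precision_at(judged, n), pooled_recall_at(judged, n))
```

Run the same queries and the same judging procedure against each engine, average over queries and judges, and you have numbers you can at least compare at each cutoff.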
On United, channel 9 on the in-flight entertainment system is usually a patch-in to ATC, at least on domestic (US) flights. There may be a few other airlines that do this as well.
Sorry, no such luck :-) The 2-day shipping deal is limited to products sold directly by Amazon.com, and the Badonkadonk ain't.
No kidding. By some reckoning, it's taken them what, 3-5 years to get where they are? When was the Mozilla release party? Ah yes, June 2002. And it was open-sourced in *1999*.
Yup, this is definitely not what I would call "video search" -- it's "searching for videos". You don't get any apparent access to the actual content of the video.