DARPA Seeks the Holy Grail of Search Engines
coondoggie writes "The scientists at DARPA say the current methods of searching the Internet for all manner of information just won't cut it in the future. Today the agency announced a program that would aim to totally revamp Internet search and 'revolutionize the discovery, organization and presentation of search results.' Specifically, the goal of DARPA's Memex program is to develop software that will enable domain-specific indexing of public web content and domain-specific search capabilities. According to the agency the technologies developed in the program will also provide the mechanisms for content discovery, information extraction, information retrieval, user collaboration, and other areas needed to address distributed aggregation, analysis, and presentation of web content."
The Individual Midnight Thread - Farewell
I'm moving with all the great community of smart people and old-timers to http://www.soylentnews.org
Submissions for new names will be accepted for a whole week. Then we will have a two-stage vote.
See you all great guys there!
Like what Vannevar Bush talked about in the 1940s? That's odd: I've been assured that only space exploration can develop computers and technologies as a "spinoff". How is it possible these ideas predate Sputnik? I think it's a conspiracy.
When they can just google it?
Interesting, but does it run Beta?
Never mind, I forgot: BETA doesn't even run!
The Individual Midnight Thread - Farewell
SoylentNews
QUICK DEATH TO BETA
who is really commissioning and funding this little project? anyone care to guess which three letters are involved?
"Disable Advertising As our way of thanking you for your positive contributions to Slashdot, you are eligible to disable advertising." ORLY? As an audience member, I have not contributed in a long time.
This is my signature.
this capability exists?
FTFY
Beta is the soulless destruction of a great internet forum, as well as of Slashdot as a historical landmark, during bleak times of cultish walled gardens and MBAs coding HTML and CSS.
Join ##altslashdot
http://webchat.freenode.net/?channels=##altslashdot.org
... that Snowden didn't have good enough tools available.
This probably has g00GLe scrambling, running around like a chicken with its head cut off screaming, "We are the holy grail of search engines. We are IT." HaHaHa!!
Second - we (common netizens) may welcome the sort of information availability DARPA is seeking - sort of like the scifi future where you just ask the nearest terminal whatever you want to know and magically get the answer you need - but there are lots of bad people still running around on this planet (scamsters, governments, jilted ex-lovers, religious extremists, etc.). The problem isn't the technology, the problem is our ability to handle it.
I very much suspect DARPA may be onto something. I wonder if it will be as beneficial as the WWW has been.
FBI M-O-U-S-E. next best thing to surfing along looking over your shoulder, taking notes, texting the US Marshals Service what they need to subpoena before driving over and clapping on the leg irons.
if this is supposed to be a new economy, how come they still want my old fashioned money?
Slashdot needs to see a specialist. Radiotherapy may work on beta.slashdot, but if it fails my recommendation would be to nuke the damn thing altogether, just to be sure it is not spreading to others.
Alien
Just how many buzzwords can we fit in here, anyhow?
Did you guys notice how often news is being posted on /. now that "fuck beta" comments appear on almost every post?
It's nice to see that someone is deploying a damage control team.
So is Dice giving up on the beta?
Google already does domain-specific indexing: certain sites get indexed faster or deeper than others based on a number of secret rules.
For site specific search prefix your query with "site:foo.com"
"When information is power, privacy is freedom" - Jah-Wren Ryel
That should probably be Darpa's response to Google buying Boston Dynamics.
answer: google. now, where's my grant money?
So do I... One that is robust and can't be censored by anybody, ever.
“He’s not deformed, he’s just drunk!”
Just search in Google for "holy grail of search engines". Job done.
return 0; }
It can still be improved. Take the following 'search', which is more computational than search-y and which nothing today comes close to answering:
- How often has it rained on Jan 15th in Seattle, WA, over the past 50 years?
The answer would be something like "it has rained on that day 40 out of 50 times". Unless someone has literally written that phrase, or something close to it, you will be stuck computing that data yourself. We have computers for this, right? :)
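For what it's worth, the "computing it yourself" part is small once you have the data. Here is a minimal sketch, assuming you already have daily precipitation records as (date, millimetres) pairs. The function name `rainy_day_count`, the threshold parameter, and the sample values are all made up for illustration and are not real Seattle observations.

```python
from datetime import date


def rainy_day_count(records, month, day, threshold_mm=0.0):
    """Return (rainy_years, total_years) for one calendar day.

    records: iterable of (datetime.date, precipitation_mm) pairs.
    A day counts as rainy when precipitation exceeds threshold_mm.
    """
    # Keep only the records that fall on the requested month and day.
    matching = [(d, mm) for d, mm in records if d.month == month and d.day == day]
    rainy = sum(1 for _, mm in matching if mm > threshold_mm)
    return rainy, len(matching)


# Hypothetical sample data, NOT real weather observations.
sample = [
    (date(2010, 1, 15), 4.2),
    (date(2011, 1, 15), 0.0),
    (date(2011, 1, 16), 7.5),
    (date(2012, 1, 15), 1.1),
]

rainy, total = rainy_day_count(sample, month=1, day=15)
print(f"Rained on {rainy} of {total} recorded Jan 15ths")
```

The hard part is the one the search engine would have to do for you: finding a trustworthy record set and working out that the question needs a count rather than a document.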
but that site is even gayer, holy shit, I didn't think something could beat out slashdot in gayness, but congrats sir , you've really outdone the gay this time, fagot
Sometimes, as you read the RFP document, you can tell it's been written with a specific proposer in mind, and everybody else gets a short window to come up with an idea and jump through an endless list of government hoops.
Freedom! Freedom! Freedom!
Sorry, what with all the negativity around here lately, I couldn't resist.
There is nothing wrong with the way searching is done today.
This is merely an attempt to constrict the web. If you start indexing search results by TLDs, you effectively create a filter that can be handed over to the powers that be to easily channel all the "bad" hush-hush content to /dev/null.
The world wide web and its freedom depend on chaos.
http://www.dogpile.com/
metasearch
Insanity: doing the same thing over and over again and expecting different results. Albert Einstein
Maybe I'm wrong but... I'll bet DARPA's winning technology will be the one that redirects all search queries to the NSA. So they can tell us what to think.
Google, Deepmind, etc.
"If any question why we died, Tell them because our fathers lied."
Today, the findability of unstructured information is limited by the Shannon limit. This is a fundamental physical limit, since all pattern search engines are statistical decoders. Google does a little better than the Shannon limit by looking at which search results are selected; this is a communal intelligence technique based on how we "vote" for the right result. Unfortunately this only works well for high-volume searches, which is why Google works best if you know exactly what you're looking for or you're looking for what everyone else is looking for. In commercial search Google is losing out to companies like Amazon that are investing in editorial findability enhancements using lots of work by folks on Mechanical Turk.
eBay is actually doing some of the most interesting work in findability research, because they have the 'everything' search problem, which is harder than the popular-search problem that Google is mostly concerned with now. Google seems to have given up on technical, scientific, commercial, practical, professional, industrial, and other sorts of specialized search. This sort of information usually has a very low Shannon limit, which is why professional search usually uses extensive manual indexing such as that provided by Westlaw.
The holy grail of machine translation is automatically extracting the exposition structure (rigor, rationale, rhetoric) from texts, but nearly no progress has been made so far despite decades of research, and here again the problem is the Shannon limit. Presumably this is the problem that Memex must solve in order to succeed, but it can't be solved by machine intelligence alone; that's a physical impossibility, so even fashionable techniques such as deep learning are out the window.
In the publishing world there are techniques that turn exposition structures into texts; this is authoring automation. It could be used to generate sample texts for some target search, and machine intelligence could then score matches between the generated samples and the texts being searched. In this way it might be possible to automate educated guessing of the exposition structure, partially getting around the Shannon limit with systematic editorial augmentation.
If they can't find the Holy Grail of search engines, I can direct them to this alternative that is just a Grail Shaped Beacon of a search tool instead...
This space unintentionally left blank.
This sounds like programmatically generated RDF (http://www.w3.org/RDF/), and could be handled via OWL (http://www.w3.org/TR/owl-features/)... Current search engines and metadata focus on content (which makes sense), but leave out context. The W3C has been supporting research on ontological networks, where semantics are collected and linked. The W3C concept is self-assigned meaning for the content. DARPA's Memex sounds like they are assigning meaning via a 3rd party. Less privacy, more meaning.
Have one without "sponsored ads".
Google has produced less useful hits over the last five years, as its advertising income skyrocketed. Search for something and add -"photo", and you'll still get photographer. Try mens riding boots -women -womens -women's, and you'll probably get a "sponsored ad" for women's boots, with the letters "men" bolded.
This isn't even counting Target, which will claim it has anything you're looking for, but if you follow the link, oh, no, sorry... but we have !!!!!
Add the way librarians used to search, not just with AND and OR, but also "within so many words", so that men's leather boots, and men's riding boots and men's fashiondesigner wonderful boots all are returned, if you tell it men's within 3 words of boots.
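A minimal sketch of that "within so many words" operator, assuming "within 3 words" means the two terms sit at most 3 word positions apart (other systems count the words in between instead). The names `tokenize` and `within_n_words` and the sample titles are hypothetical, purely for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_n_words(text, first, second, n):
    """True if `first` and `second` occur at most n word positions apart."""
    words = tokenize(text)
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= n for i in first_pos for j in second_pos)


# Hypothetical product titles, not real search results.
titles = [
    "men's leather boots",
    "men's riding boots",
    "men's fashiondesigner wonderful boots",
    "women's riding boots",
    "men's hats and also some very nice boots",
]

matches = [t for t in titles if within_n_words(t, "men's", "boots", 3)]
for title in matches:
    print(title)
```

Because the comparison is on whole words, "women's" never matches "men's", which is exactly the substring problem complained about above.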
A lot of the tech is already there... but ad revenue is, I mean, what we really want, right, rather than the service you're purportedly offering?
mark
My screen just came up in beta: slow to load, and the comments are impossible to read. I tried clicking on "classic" at the bottom, and it's been trashed as well. I logged in today for the first time in ten years just to say Beta Sucks and I will be finding another site to read every morning with my coffee. I will be canceling my subscription... so sad, but this "new" look, with its intentions driven by greed and ego, is NOT why I signed up for Slashdot many years ago.
And then, let it find the luckiest person in the world.