Ask Slashdot: What Features Would You Like In a Search Engine?
New submitter nicolas.slusarenko writes Nowadays, there is one dominant search engine in the world among a few alternatives. I have the impression that the majority of users think that it is the best possible service that could be made. I am sure that we could have a better search engine. During my spare time I have been developing Trokam, an online search engine. I am building this service with the features that I would like to find in a service: respectful of user rights, ad-free, built upon open source software, and with auditable results. Well, those are mine. What features would you like in a search engine?
Next to working well, maybe the assurance that not all your search queries were logged and sold to third parties or used for advertisement?
Confidentiality
Search for what I type in, not what you think I want. I'm so sick of having to change every search to "verbatim" because my search terms are being ignored. I'd switch to someone else but they seem to be carbon copies.
What made Google so great when it was still relatively new was the results were more relevant, i.e. they weren't just a bunch of advertisements. With the rise of SEO that is less the case now, and looking for something on Google for me now means adding "-buy -purchase -price -shop" automatically.
First of all, I would make it so you can press the Enter key and it conducts your search. Forcing people to either tab or navigate their mouse to the button makes it a little annoying.
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
I'm happy to see an interest in developing steps towards an advertisement-free future. I know we are addicted to ad money right now, but we should not accept that forever.
I'd like to see a completely open search engine that allowed people to download the search indexes freely so that they may create their own in-house appliances for search without the need for going through some proprietary site that may or may not be available in the next ten years or even months.
A site that promises to deliver you your privacy is not enough, because they could really be doing anything. Google promised us our privacy, and changed and deleted their old privacy policies even though they said that they'd always keep all copies of a privacy policy on archive. They went back on the word "never" and have continued to discontinue online services that people have become accustomed to with little to no notice.
A reasonably sized search index that is extensible based on what one is searching for would be great. Localizing URL suggestions, wikipedia caches, and other toolbar-suggestion searches in a networked work environment would all have benefits; the applications are almost endless. Freeing the shackles of search from a few could do so much for innovation, privacy, and security.
Sig: I stole this sig.
Probably 15 - 20 years ago I saw a news clip, possibly on Discovery Channel's Daily Planet, showing a group of grad students working on AI that can read a chapter of Shakespeare, understand it and summarize it perfectly. I've never seen anything about it since and can't find any information about it online. I think they were at MIT.
If they can make the program understand paragraphs of text well enough to summarize it then they should also be able to make it be able to answer questions. I'd like to see a search engine with this type of technology. Feed it the whole Internet and have it directly answer your question rather than giving you ranked search results. Wolfram Alpha is the closest to that goal that I'm aware of.
The biggest challenge with a system like that would be separating the wheat (correct information) from the chaff (mainstream media government propaganda).
I have been playing around with meta search concepts for years (after doing many more years in enterprise/public search), but can't find a significant reason to produce a new public web search. duckduckgo does meta really well and does all the open/privacy stuff decently. Other than possessing a much better domain name, I am not sure what I could bring to the table.
This is Cuil all over again. No. Just no.
Best wishes for your project.
Take note of the lessons learned from the SlashDice Beta fiasco.
Buck Feta.
When I search I want relevant results; without this you are worthless.
Above all, I want a search engine to not get in my way. You're the intermediary between me and my content. I *do not* want to stay on your pages, I want to find whatever I'm looking for and get on with my life
- Don't put the tiny search field at the near bottom of the page stuck right near to the button without padding. It's the only important element in your page.
Right now your giant logo is what takes more than half the page.
- Don't require JS to use. It's a form, it has no business *requiring* JS. At least have a fallback.
- Nobody needs a "clean that field" button. Everyone who clicks that button is going to do so by mistake.
- Work on your response time. If I'm waiting >3s while nothing's happening, I'm already getting bored and slightly annoyed. I'd almost have the time to google it while waiting for the results.
- Add a description of what I'm about to click on. The relevant extract of info that gives me some context, so I know if you found my keyword in some totally unrelated improbable thing, or the right website. Just the title and url won't cut it.
- You really need to work on your results. I know it's easy to say. But I searched "github" (http://i.imgur.com/HSuaHAU.jpg) and not a single result was even close to github.com. If there's a domain with the exact same name, it might be a good idea to give it a little boost, chances are that's exactly what I'm looking for. I tried with "facebook", same result. But plenty of people google facebook everyday, and you wouldn't find it. It's not just any small site. It's frigging facebook and github.
Good luck!
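For what it's worth, the exact-domain boost suggested in the list above could look roughly like this. It is only a sketch: `domain_label`, `rerank`, and the `boost` value are made-up names and numbers, and the host parsing is deliberately naive (it would mishandle suffixes like .co.uk).

```python
from urllib.parse import urlparse

def domain_label(url):
    """Return the site-name part of a URL's host, e.g. 'github' for https://github.com/."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Naive assumption: the label just before the final dot-separated part is the site name.
    parts = host.split(".")
    return parts[-2] if len(parts) >= 2 else host

def rerank(query, results, boost=10.0):
    """results is a list of (url, base_score); add a boost when the query equals the domain label."""
    q = query.strip().lower()
    scored = [
        (url, score + (boost if domain_label(url) == q else 0.0))
        for url, score in results
    ]
    # Highest score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With this, a query of "github" lifts https://github.com/ above a page that merely mentions the word, as long as the boost outweighs the gap in base scores.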
Try searching for someone named "Beiber". Google might find him, but he'll drown in a million entries for some singer named "Bieber". But I did not search for Bieber.
There are many cases like this, where something rare has a name similar to something more popular. Don't assume I mistyped! I rarely do. But if I mistype, I can search again. But I can't deal with a search engine that blatantly assumes I'm dyslexic.
And finally, let me search for source code snippets without turning up tons of irrelevant stuff. Spaces in an exact search are not separators - if there is no match, just say so. Don't assume I might want something completely different.
..and don't mess with my query. If I search for "the best saerch engine", give me the pages that have that string, no questions asked and nothing else. Oh, and while at it, make one that does regex.
And stop providing results that fail to have ALL of the search terms.
and nothing else.
Stop adding 'features' to things that don't need them!
In addition to Google-like relevance (which is a must if you are going to survive in this field), it would be nice to have:
1) Boolean search: (cat or feline) and not (catwoman or cartoon or dog)
2) Date range which works (e.g., I want to search for websites talking about Enron BEFORE the scandal).
3) If I see a result that's obviously irrelevant, I'd like to be able to down vote it.
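To make the Boolean example in item 1 concrete, here is a toy filter over a handful of made-up page titles. It is a sketch under obvious assumptions: `tokens`, `matches`, and the sample `docs` are hypothetical, and a real engine would evaluate this against an inverted index rather than scanning text.

```python
import re

def tokens(text):
    """Lowercase a text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(text, any_of, none_of):
    """True if text has at least one word from any_of and no word from none_of."""
    words = tokens(text)
    return bool(words & set(any_of)) and not (words & set(none_of))

# Hypothetical sample documents.
docs = [
    "Caring for your cat in winter",
    "Catwoman cartoon reviews",
    "Feline nutrition basics",
    "My dog chased a cat",
]

# (cat or feline) and not (catwoman or cartoon or dog)
hits = [d for d in docs if matches(d, {"cat", "feline"}, {"catwoman", "cartoon", "dog"})]
```

Here `hits` ends up holding the first and third titles; the dog page is dropped even though it mentions a cat, which is exactly what the "not" clause asks for.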
If you are doing any crawling or indexing and putting it on a PostgreSQL backend, you are doing it wrong. It is fine for meta and website management, but please use a real indexing solution for any search query backend.
...said the Anonymous Coward
I would like a feature that makes it possible to perform a search query without Javascript enabled, so I guess my needs are fulfilled by all other search engines than Trokam.
1. Search for the thing I typed in. Search ONLY for the thing I typed in. Don't search for the thing you think that I meant to type. Search for the explicit text that I entered, and nothing else.
2. Display only results that actually explicitly contain the text that I searched for.
3. Don't give me any of those SEO spam sites or ransomware sites where you have to pay to find the answer to a question. Shove those to page 99999 of the available results.
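Points 1 and 2 boil down to plain substring matching with no spelling correction. A minimal sketch, assuming pages are already fetched into a dict of URL to text (the function name `verbatim_results` and the `ignore_case` switch are invented for illustration):

```python
def verbatim_results(query, pages, ignore_case=True):
    """Return the URLs whose text contains the query exactly as typed.

    No stemming, no spelling correction, punctuation kept as-is.
    An empty list means there was no match, and that is the honest answer.
    """
    needle = query.lower() if ignore_case else query
    hits = []
    for url, text in pages.items():
        haystack = text.lower() if ignore_case else text
        if needle in haystack:
            hits.append(url)
    return hits
```

Given one page containing "the best saerch engine ever" and another with the correctly spelled phrase, a query for "the best saerch engine" returns only the first; a query that matches nothing returns an empty list instead of a guess.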
I want the old Google exact searches you used to be able to do +"exact keywords" so you can filter out all the sloppy, useless results Google has now. Since 2008, Google has been in a downward spiral. Any other search engine could have stepped up. Why didn't Bing or one of them fill the void that Google left when its search results got useless and sloppy?
Egg zact (hey, good name for a startup) searches are useful if you're looking up exact error codes. Sloppy searches don't work for that purpose.
I'm wondering what you mean by "Search Engine"? Do you mean a way to sort and rank websites? That's only part of what Google does. You may want to identify what is missing from Google before following the models of the past.
Amen, brother. Similarly, I switched from Lycos a decade+ ago because they dropped Boolean searching (some of us are power users!). I used Yahoo! next, but it was painful on dial-up with all the extra junk on their home page. Then I came across this new, misspelled site called "Google". I loved it; but lately it has been wearing on me as it panders more and more to the masses.
Note to Google: We nerds might be in the minority, but it is WE who direct the non-nerds as to how to set up their digital devices, avoid online trouble, choose their search engines, etc. Don't ruin it for us. I already started to keep one eye open for another search place, because I fear it'll only get worse.
To give me search results that accurately are what I meant to type and not what I did type. And a bonus would be if it would accurately know better than me what I am looking for and give me those results. In other words read my mind and correctly anticipate when I am wrong and still give me the correct answer.
It wouldn't respond to my request. I had to allow a jquery script. Then it searched but couldn't find 'Benghazi'.
Things have been lost from search. Alta Vista allowed search for 'word1' NEAR 'word2', which proved very useful. Google used to give information about its finds such as date, size, ('cached' is still there, but hidden) and some things so long abandoned that I can't remember them. You know why date is important; size is also important because a very large page containing your terms is probably clickbait. A great sadness for me is that Wolfram Alpha is so wrapped up in fancy scripts that I've never been able to use it with my fairly secure Firefox (oh, it's better today).
Accurate reporting would be nice. I'm looking at a Google result that claims it found "About 54,100 results (0.46 seconds)" when actually there were only 245 unique results.
Location would be nice (maybe a flag icon from that country). An opportunity to vote the relevance of a result up or down and maybe indicate something inappropriate. Wildcards would be incredible. Apple's Spotlight search engine can now search the internet as well as local files- maybe your engine could take advantage of some sinister simpatico surreal symbiosis.
We need a fresh approach after a long period of stagnation. Who knows what clever innovation has been missed?
...omphaloskepsis often...
Also, filter out all the pages-of-links search spam
If you want to charge a fee, you could include a link to someone who is better at searching for stuff than I am, or maybe Watson.
Finally, include all the internet that Google hasn't indexed.
Have you seen this list of some of the few alternatives?
Being able to say "find 'blah' when it is within X words either side of 'bleh'......"
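That "within X words either side" check is simple to state in code. A rough sketch, where `near` and its parameters are hypothetical names and a real engine would use positional indexes instead of rescanning the text:

```python
import re

def near(text, a, b, max_gap):
    """True if words a and b occur within max_gap word positions of each other, in either order."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    # Compare every occurrence of a with every occurrence of b.
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)
```

For the sentence "the quick brown fox jumps over the lazy dog", `near(text, "fox", "dog", 5)` is true (the words are five positions apart), while `near(text, "quick", "dog", 3)` is false.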
When I open your search engine, I want the focus of my cursor to default to your search form.
After I found out that you didn't even have this, which requires no more than one single attribute in html, I didn't have the confidence to go any further. Usability testing is cheap. The idea that you would forgo any kind of basic usability testing, before asking for feedback from Slashdot users, tells me you don't have the experience, nor the real desire, to make a decent halfway usable search engine.
Often, I'm searching for one thing, and I discover that the words I'm using match something else, like a band or celebrity's name, that swamp the results I'm looking for.
To the extent your data model does clustering, it would be really nice to be able to show clusters of results so I can find what I want among the substantially-similar.
For example, if I search for something commonly sold, I'd like "sites that sell it", "reviews", "how it works", "how to make your own", etc. Labels seem difficult to automate, but if you can just manage the groups, I can figure out which ones are of interest to me pretty well.
The same problem comes with Wikipedia and the dozens of sites that crib from it. From a user point of view, they're all the same result. Perhaps I want to find "who's copying from wikipedia" and explore that cluster, but I'd rather they all get grouped together so I can skip over them.
Make it easy for me to specify I am looking for technical information, or looking to buy something, or what have you. All too often I am trying to do a search for technical information, but if that acronym has also been used by Beiber lately I am SOL. I would love it if I could weed out the pop culture hits when I wanted to omit them.
Similarly I would like a search engine that I could easily specify if I also want hits for related words, or just EXACT match, and whether to ignore capitalization or not. It is maddening when an acronym also happens to be a common word and I get flooded with useless crap.
1. Return first the results which exactly match the search terms.
2. Do not include results where one or more of the search terms only exists in an advert on the page.
3. Re-introduce a feature which an early search engine (I think it was AltaVista) had, where you could specify a search term to be 'near' another.
4. (more important in languages other than English) Allow you to specify that any tense, person or case of a search term be matched (eg if searching in French, *aller would match any of 'vais', 'vas', 'va', 'allons', 'allez', 'irai', 'allâmes' etc)
5. Allow you to restrict the results to those where the search terms are actually rendered on the page when you follow the link.
First I want a toggle that I can never ever ever ever ever see that domain in my search results. So ask.com answers.com experts-exchange.com huffpo and especially Quora you fucking turd pile of shot Quora; I never want to see Quora again in my life.
When I want a little pizza joint or some place that hasn't hired an "SEO" guy all I get are page after page of directories derived from some government database or some crap like that. Their actual page is bottom ranked. I don't want review sites. I don't want anything that was assembled by a machine.
So a simple rule of thumb is de-list any page that offers to "upgrade" someone's listing. Full stop. Also I want a toggle that will remove listings that have any version of "upgrade to our pro service" Literally they could cure cancer but offer to cure it 1 minute faster for 99 cents and I don't want to see that page.
To me right now nearly the entire search results are like going to a dating site and only finding hookers. Some people would argue that they "need" to make money but they don't. There are lots of pages that exist for a specific reason and many of those pages are commercial, as in they offer a specific service, such as a pizza place where the page is about their pizza place. Short of the recipes the page is 100% free. But I don't want some shit "Just Eat" website. Maybe they can link to the other page but I really don't want to see it ever again. For instance I loved allrecipes.com. But now it is just upgrade upgrade upgrade upgrade. Some will argue that they should be allowed to make money, but quite simply the site existed before some MBA took over and "monetized" it. That's fine; I just no longer want it to turn up in my search results. Don't ban it from the entire search engine, just ban it from the search engine when you tick the "No upgrade sites" option.
The other thing that I would kill for is a negative feature option. So any site that uses Disqus would vanish from my search results. Those scumbags need to burn in hell and I would love any search engine that sent them there.
To me there is a huge opportunity for some new search engine to do to Google what they did to all the others 17 years ago; completely make them irrelevant by brutally ignoring the wishes of the larger websites and completely focusing on the needs of the average user.
able to find more results by ignoring robots.txt
does not / technically impossible to remove or censor results e.g as result of DMCA request
Google has been ever-more pissing me off with its "sponsored" results in which I am almost never interested... I have to go further down the page to get the things I want.
Related to this: Google's recent proposal to post "truthy" results before others. Just no. I don't want or need a nanny-search. I'll judge the results for myself.
As far as I am concerned, results "filtered" or sorted according to Google's idea of "truth" is little more than a rather transparent effort toward censorship.
no bullshit on the search page
an "i feel lucky" button
now, if anybody could satisfy both of those...
Google!!!!! Do not reinvent the wheel. Google can be a wonderful search engine. How in the heck can a new product provide the variety and depth of search that Google can with their enormous data base and ample hardware?
...all meta shopping sites.
I'm at my wits end when alibaba or ali express or kelkoo or tengo or whatever is in the top five of EVERYTHING I search for. I don't ever want them to be even in the top 1000 unless I explicitly type "Meta shop" or whatever.
Apart from that one filter, just search for what I fucking asked for, not what you think I might have meant.
No, your children are not the special ones. Nor are your pets.
Syntax language must be clear and present; Google does this but hides it a bit. Others place documentation in other out-of-the-way places on the page, difficult to find.
Option to search with/without commercial interference. Option to search encrypted against NSA (et al) spying.
Option to search via Artificial Intelligence or via literal string search (like the difference between a general "show me a duck" and a literal "nodejs npm commands").
Search results for ALL (voice, video, files, text, EVERYTHING) on one page and categorized with thumbnail previews.
Removal of any and all filtering. It is time we looked straight into the brown void of humanity instead of pretending we are a collection of sitcom actors, people controlling my mind piss me off something fierce.
Search that actually goes into the page instead of just reading the text that's there stock. Web developers right now create very inefficient sites because they want 'SEO' which is a stupid buzz word for a search ranking. Sites must be hard coded into server file directories so that bots can read it, a proper site would have the pages as an array of javascript memory objects because it's hundreds of times faster, but the bots cannot read it. Currently the only solution is to hard code a chunk of your first page, create inefficient hard coded pages, or create dummy hard code pages not used that the bot can see. There are some solutions but the problem is writing code into the HTML in the first place.
I find it most annoying with Google over the last several years that they mangle the URL they send me to so I can't easily change it to the parent or higher level URL.
When I search, I don't mind ads. I mind malevolent ads. But I mind malevolent sites more. One is solvable. Generically the site that has the most to gain through advertising pays the most. And I generally benefit from those. But SEO is more insidious. They are trying to gain my attention from generally more appropriate sites by gaming the system. If the system is evolving correctly the best most appropriate sites get selected automatically. The major search sites recognize these things. Because if they stop providing relevant sites other sites will come along and become more relevant. I am not going to use your site if there are no ads. No ads means less relevant searches for me. The site with the most relevant search wins. Hands down. Everything else is meaningless.
Nearly all search engines still think web pages are static, or generated at server-side. That is less and less true, and many web sites are now single-page applications fetching their content dynamically using AJAX requests. A search engine should search in pages as a real person sees them, not as a robot ignoring JS and CSS sees them. It's a shame that all SPAs on earth have to generate a static version of the app using their own robot just to please stupid search engine robots not able to do the same.
So say I'm searching for something with really common words in it. I can't think of anything specifically right now, but this is my most common search failure.
I get back a bunch of results. They have all the words I'm looking for, but they're all about two or three more popular topics. I'd like to be able to select a search result and tell the engine that results like this are incorrect for some semantic reason. Maybe it's a band name and I'm looking for a book title. I should be able to say I don't want anything related to music. No bands, albums or songs. I'd be willing to tag results with some context to provide hints to the algorithm.
Things like 'windows' tend to mess up results; Google assumes that I either mean the Microsoft kind or the house kind, but sometimes I'm trying to figure out what's wrong with a particular application window. I run into this sort of thing surprisingly often.
Specify search areas:
[X] Public internet index.
[ ] Torrent sites only.
[ ] File database of dubious legality.
[ ] Archive of device drivers that actually work.
[ ] The Dark Web, whatever that is.
[ ] Data sheets and manuals.
a query language would be nice. Ex. Find all pdfs with author names that also show up on page x.
I would like a search engine that could show me the top categories for my search results. This might be beyond your goals but it is what I would like.
For example, I was searching for a program that would deal cards.
If I had the word "deal" in the search I got 1000 shopping sites in a row. But I was not looking for shopping. All my search terms were common and could be used in conjunction. So card and deal together would get business cards or greeting cards. If an engine could provide some sort of organization, I could then drill towards what I am interested in.
How the engine would be able to do this I am not sure. Obviously semantic web is one avenue, but the search might recognize lots of links in common or it might look at supplied key word or common terms on the pages
Google had something a bit like it (in labs only I think) but it was buried and not a simple choice about how to present results.
Without good results, it doesn't really matter about the bells and whistles. I use a search engine to find information, so it better do that extremely well. For example, I just couldn't stand using DuckDuckGo (aka Bing) because of this, and went back to Google. Bing consistently failed to find the information I wanted, while Google had it on the first page.
So, after your engine returns as good or, ideally, better results than Google, you can start thinking about other features.
One feature I'd really like is to be able to tweak my result set. Something like if I search for "AC DC", I get a bunch of results about the band "AC/DC". That's not really a bad result given the input, but in this case I was after an explanation for the electrical terms.
So I'm thinking some ability to mark one or more of the results I don't want and say "not pages like this", and it would cull those talking about the band, in a weighted manner. Or some other way to help me find the information I want when I search for some ambiguous terms.
I'd like advanced image searching features. Searching images by the text contents of the page they're on is bogus and probably google's biggest weakness. Images should be searchable by content. That means recognition of image contents (non-trivial, unsolved problem), but it also means some easier to realize stuff. For example, good OCR would go a long way, bonus for handwritten characters or characters on a noisy background. Search for webcomics based on words in the dialog or image macros by caption. Image searches that accept images as input. Google does this now but very poorly. Recognizing when that input image is part of a larger work, either a cropped section of a larger image or a frame from a video. Suggest similar images based on that info.
Search engines should shine a light on sites that show different results to different users, maybe it's for commercial exploitation (GEOIP blocking), or political propaganda or whatever.
Search engines should allow users to run crawlers in a coordinated, distributed manner. This helps users have privacy, adds extra noise to surveillance systems, and might give users deniability as to their intent to access subversive material. It should also help with the first problem.
That does not filter porn even without safe search off!
Let's say Linux kernel 4.0 just came out. If I search Linux then my top result will be a previous kernel version. What I'd like is a search engine that could put the most recent result first, not just the one with the most links to it.
I would like to be able to indicate that a search term should not produce a hit, but should not disqualify the page either.
A search engine must not rely on javascript for simple things like submitting a form.
Yes, I realize it would probably be combinatorially explosive and break the search engine's platform, possibly even the Internet, but I have long wished I could submit a regular expression as a search term on Google.
I've started drifting away from using Google/Bing/whatever, in favor of loading a bunch of site-specific search engines into my search bar.
So if I'm looking for, say, a specific Magic card, I don't let Google search the entire net, and find everything that happens to say "elvish mystic", giving me a ton of irrelevant stuff (even searches like "mtg elvish mystic" bring up pages to buy one instead, which usually don't have the info I'm really looking for). Instead I click an extra button to go straight to Wizard's own database page for it.
Repeat that idea for about twelve different gaming wikis, plus Wikipedia for general knowledge, and you'll have the contents of my search bar. If I were into different hobbies, I might have similar search engines for those.
A single search engine that can figure out the context of the search, then go straight to the experts for that context, would be one way to do better than Google.
Seriously: words suck at describing or conveying a lot of things. And half of these comments are just people who don't know how to use a search engine properly. They can't read your minds [yet] and we are using the equivalent of charades to try and explain to a computer what we're looking for.
Until we can directly share a mental image with a computer, we're going to have to deal with crafting search queries.
Here's the tool google uses for extracting meaningful terms from pages with Microdata or RDFa Lite attributes:
https://developers.google.com/structured-data/testing-tool/
Get rid of the sites that have a single paragraph and then a registration or paywall blocking the rest of the content.
Get rid of the sites that are just copies of other pages with ads.
Or let us easily block a site from appearing in results in the future. Enough users vote a site off, have a human take a look to see if they should remove it for everyone.
Privacy goes without saying, of course... but if I put a period or a comma in my search, I damn well meant it to be there. Pay attention to it.
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
What features would you like in a search engine?
From a (new) search engine i would like to be ONLY search (NOT a "recommendation" or a "search and recommendation") engine (like the -very- old search engines)... and booleans, and fucking apostrophes/quotations/parentheses/bracket! Just that please: A FUCKING SEARCH ENGINE is just enough!
Thanks.
If I'm looking for information on someone who lived and died in the 19th century, I'll end up with a bunch of results that are from "people finder", "background check", "white pages", etc. sites. I'd like an "exclude this and all similar results" button to clear all of them out and get to results that are actually relevant to me.
When I do a search on just about anything, I don't want to see X-rated material show up.
Also want results by date.
evil.
... they could use a much better name
That "trokam" moniker does not make any sense to 99.999% of the users
Want to be able to specify more about the source, e.g. newspapers only, blogs only, academic only, published in 2007 and so on. Ads I can ignore.
Hey, look!!! Another geek with a napkin to write a business plan on, with one exciting feature that completely rips out any possible profit, and no other usable resources, especially workable webcrawling tools, scalable databases requiring petabytes of storage, servers scattered around the world to do the actual web crawling so local bandwidth doesn't get swamped, security of the systems, or any of the 47 other problems a real business would have to deal with.
The Dotcom is back! Woohoo, I made *so much money* cleaning up after those clowns!!!!
Regex & Boolean!
I have the same complaint that a lot of the complainers above have: results that attempt to "correct" your terms, and you end up getting a lot of irrelevant results. Thinking about the issue, I am pretty sure that some search engines end up giving the most "searched for" results. They lean toward 'popularity.' Thus, you get results based on mass population hysteria. One possible solution to this problem would be the ability to turn off or on "similar searches others have made."
Here is another problem I have. I search for basic information on cooking. I want to know the baking time for pork chops in an oven. A lot of the results, the vast majority, are recipe sites with lots of fancy recipes, and to find out how long and at what temp to cook pork chops in the oven, you have to weed through 40 other ingredients and sauces.
This might help. When the results are displayed, have a bar or box on the screen that gives you some simple options that you don't need to be a nerd about. Something like "results are too technical" or "more technical" or "exclude news reports and current events". Again, the idea is that these options to better aim the search can be given both before and after the search is started, but presented in plain language everyone can understand. (Yes, I understand Boolean search, not that it helps much anymore!)
A home page that doesn't take 15 seconds to load while it shoves analytics and facebook and ads down my throat?
Search for what I type in, not what you think I want. I'm so sick of having to change every search to "verbatim" because my search terms are being ignored. I'd switch to someone else but they seem to be carbon copies.
I'll second this, if I type in FORD THUNDERBIRD, chances are I am searching for information on a specific automobile type and not native American mythology.
I also don't want every F-ing web site that has the word FORD on it somewhere. Low quality click-bait sites (mainly porn, dubious finance, spamvertiser, or drive-by malware pushers) are notorious for this, having a block of text at the bottom with common search terms so they show up in searches they shouldn't.
Likewise I don't want to see the same site listed more than once, If I search for "cats with hats and bats" (it's a video poker machine) I don't want to see the same Vegas hotel web site come up a dozen or more times, once is enough.
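For what it's worth, the "once is enough" part is cheap to do after ranking. A minimal sketch, assuming the engine already has an ordered list of result URLs; the function name and the example hosts are made up for illustration:

```python
from urllib.parse import urlsplit

def one_per_site(urls):
    """Keep only the first (highest-ranked) result from each host."""
    seen = set()
    kept = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        # Assumption: "www.example.com" and "example.com" are the same site.
        if host.startswith("www."):
            host = host[4:]
        if host not in seen:
            seen.add(host)
            kept.append(url)
    return kept

ranked = [
    "https://www.vegashotel.example/slots",
    "https://vegashotel.example/cats-hats-bats",
    "https://slotreviews.example/cats",
]
print(one_per_site(ranked))
```

A real engine would probably want a "more from this site" link rather than silently dropping the rest, but that is a display decision.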
An "expert" mode that accepts regular expression searching would be a nice touch as well, so I can search for "this AND that BUT NOT this OR this".
Limits on searches would be nice for some people as well. If I'm searching for information on current tax laws, those change every year so if it is five years old it is probably worthless.
I would also like support for blacklisting (I'm surprised that modern browsers don't have that as a plug-in). If I run across a scam site (like ancestry.com which advertises free info that actually costs you $20/month) I want to click a box and NEVER see that site displayed again, EVER. Unless I delete it off my blacklist.
Highlight other sites of dubious value. There was an old search engine, I can't remember the name, that prefixed all pay sites with $$$. Highlighting ad sites and known malware pushers in red would be nice as well.
Some of this is probably not even technically possible, but some of it is. I'm sure highlighting known malware pushing sites would probably be legally risky, and blacklisting would probably require help from the browser.
Standard stuff: exact string search within quotes, fuzzy word search, specific site search, filetype search, limit results to a date range.
DO NOT TAKE THE PHRASE APART. Just return no results.
Query websites not well connected to the main commercial internet
Special requests
I have a machine crawling 24/7 using an open source web crawler just for my specialized scientific, epigenetic, and materials searches.
I'd like a search that actually searches exactly what I type and not what it thinks I might mean.
Including punctuation exactly as I type it instead of ignoring it.
Search features I depend on:
* Non-English characters. Handle multiple encodings of web pages and URL-encoded characters in search queries.
* site: to search only within a domain. This is often a national domain, such as "site:co.uk" to search only British sites.
* Minus: Being able to block certain words, or sites.
* Plus: A word prefixed with a plus is required.
* Quotes/hyphen: Searching for exact phrases. "Java class file" is different from "Java File class".
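The operators in the list above are simple enough to pull apart with one regular expression. This is only a sketch of a query parser, not how any real engine does it; the dictionary keys and the handling of "-site:" are my own assumptions:

```python
import re

# An optional +/- sign, then either a "quoted phrase" or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    parsed = {"required": [], "excluded": [], "phrases": [],
              "sites": [], "excluded_sites": [], "terms": []}
    for sign, phrase, word in TOKEN.findall(query):
        if not (phrase or word):
            continue  # an empty pair of quotes
        if word.lower().startswith("site:") and len(word) > 5:
            key = "excluded_sites" if sign == "-" else "sites"
            parsed[key].append(word[5:].lower())
        elif sign == "-":
            parsed["excluded"].append(phrase or word)
        elif phrase:
            parsed["phrases"].append(phrase)  # kept verbatim, word order intact
        elif sign == "+":
            parsed["required"].append(word)
        else:
            parsed["terms"].append(word)
    return parsed

print(parse_query('site:co.uk +java -coffee "Java class file" tutorial'))
```

The point of keeping phrases as a separate bucket is that "Java class file" and "Java File class" stay different all the way down to the matcher.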
Where current search engines are lacking:
* If there is a period between the words then they do not belong to the same phrase. (A search for "Hello Google" should not return "Say Hello. Google for it." as its top result)
* Use word order in search query to weigh how important a search term is. Rank pages higher when those words are closer together.
* Don't correct my spelling by default, assuming that my search query is in US-English. (I am speaking to you Duck-Duck Goo!). I can spell, and I do not always write English. If I misspell then that is my mistake, and sometimes I search for a brand name that was misspelled intentionally.
* When indexing a web page, identify what is the important text on the page and ignore the rest. For instance, on an internet news site, the text in the articles is most important. On a forum, the text inside the comments. On this forum, articles followed by comments. What people have written in their signatures is not important. Slashboxes are not and ads are definitely not.
It is aggravating when you use Google on a collecting site and you get every other page on that site in every search result because members have listed their collections in their signatures.
If I search for the word "review", I don't want every page on every web store that has a Reviews tab.
Pages on a site often follow a certain pattern - find that pattern to find which text on each page is the most unique.
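One crude way to find that pattern: count how many pages of a site each line of text appears on, and throw away the lines that show up on most of them (signatures, tab labels, slashboxes). A rough sketch under that assumption; the 50% cutoff and the input shape are arbitrary choices of mine:

```python
from collections import Counter

def unique_text(pages, max_share=0.5):
    """pages maps URL -> list of text lines from that page.

    Returns URL -> the lines that appear on at most max_share of the
    site's pages, i.e. the text that is not part of the shared template.
    """
    line_counts = Counter()
    for lines in pages.values():
        line_counts.update(set(lines))  # count each line once per page
    limit = max_share * len(pages)
    return {url: [ln for ln in lines if line_counts[ln] <= limit]
            for url, lines in pages.items()}

site = {
    "/a": ["Reviews", "Rare 1952 stamp for trade", "My collection: coins, stamps"],
    "/b": ["Reviews", "Question about coin grading", "My collection: coins, stamps"],
    "/c": ["Reviews", "Forum rules"],
}
print(unique_text(site))
```

Only what survives this filter would go into the index, so a search for "review" no longer hits every page with a Reviews tab.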
"We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
Did you not see what the Crashcourse guys did? Make a donation if you can afford it and if you want to, but they understand many people won't/can't, and that's fine with them.
For example, I'm not paying you to read your post. And you're not paying for this reply. Not everything that happens in the world requires silver to cross palms.
Money, money, money. With you it's always with the money. There are other ways, McScrooge.
Why won't this useless search engine tell me the best place to ford the thunderbird river?
Think of Google's search-as-you-type as a 1-dimensional suggestion list. I'd like as I type to see around the search bar a matrix of categories: news, videos, documentation, blogs etc. Then as I hover over a category with a mouse I zoom into a matrix of subcategories for that category using the mouse wheel. I zoom out back one level if that's not the branch I'm thinking of.
In addition, I don't want to click until the very end, and maybe not even then. Hovering over a set of results shows me what's at the deeper level, and when I'm looking at one or a handful of pages that match the criteria as I refine further, each is also shown as a cell. Hovering over it will give me a preview -- from the search engine, not my browser fetching an actual page. Only when I'm certain I want to go there, I'll click.
That would be a search engine of the future. Or, idea #2: make it like google, but when I control-click on the link for the page it opens a sanitized copy of the page, provided by your server, so I know there are no scripts or malware and crap. And if possible give me that sanitized preview when I hover over the page so if I'm lucky I don't have to click on anything at all.
I know sites wouldn't like it but just saying what I'd like to see that I think is technically possible. Thanks for listening!
But if I want to view the poems of Emily Bronte I don't want 100 gazillion results from Amazon.
Just like I use NoScript and AdBlock+ so I want to cut out the shop windows. If I want info from the web then I don't want canned waffle.
I do not want the search engine to try to figure out what I mean. I want it to find pages that have the terms I typed in them. It annoys the piss out of me when results pop up that don't contain the terms I had searched for. These results are completely irrelevant to me.
Also I HATE it when a search engine says "oh did you mean blah blah?" NO. I AM NOT STUPID. DON'T QUESTION ME AND MAKE ME ASSERT MY ORIGINAL QUERY.
Modern search engines are darn good. The problem is not that they lack features. Almost all requests I've read so far CAN be achieved with Google provided someone RTFM. Moreover suggestions, basic natural language processing and even search 'bubbles' are convenient and DO help the majority of people find the intended result in the first few results. The problem IMHO is the existence of a single point of failure through lack of diversity. And as Murphy's law goes, if something can go wrong it will eventually go wrong.
Computers and internet access cost money. While we would all love an ad free search experience, it is impossible to make that happen. If you think it is possible then you are just an idiot or need to remove your tinfoil hat and take a real look into reality.
As far as the original question. I would like the ability to adjust my search/ad preferences. Sometimes I search for a one time need or maybe a need for a partner/friend. This sometimes incorrectly defines what I am looking for and I would like the ability to say so. That ability would lead to better search results and more correctly targeted ads.
I'd love to search using regular expressions, failing that, at least a much more precise way of indicating what must and must not be in the returned results.
--JLockard - "Some mornings, it's just not worth chewing through the leather straps." - Emo Phillips
It would be very useful to be able to control what the search engine thinks I'm actually searching for. Taken from: http://unqualified-reservation...
"Free software as in beer, copy protection as in racket" - Telsa Gwynne
This goes for all programs and websites.
"Near" keyword, logic constructs, all those nice features AltaVista (which was just a hardware-demo) had, and Google never managed. Google is borderline unusable these days and you strongly notice they do not care about good search but only about placing their ads and profiling you.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Google by now should have a ton of data on its users. Why not list about 3 possible searches that the user is predicted to search for on the home page? For example it is almost lunch time and I usually look up menus of nearby restaurants. Google can predict which restaurants I'll choose based on previous searches and location data. Google does something similar to this with Google Now but they can build on that more.
I'd like to be able to search for regular expressions, or at least boolean. I'd like to be able to specify the context of a word, eg Java (geography), Java (coffee), Java (programming), or similarly the context for the search.
Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
1) Exact string matching. As an example, if I search for " 'x.25' " don't give me hits for something with dimensions of 45mm x 25mm.
2) Allow more complex search constructs. For example I'd like to be able to specify the search term " 'x.25' near protocol -handbook ". You can sort of do that with Google's Advanced Search, but it's extra steps and you still don't get terms like 'near' or exact match.
3) Bonus points for boolean constructs such as " (lions or tigers or bears) near woods ".
In short, provide a robust search engine that will support meaningful search terms that can be used for more than shopping for a new TV or figuring out who stars in your favorite reality tv show.
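To make points 1) to 3) concrete, here is a small sketch of exact-token matching plus a 'near' test with an OR group on one side. The tokenizer rule (keep dots inside a token so 'x.25' stays whole) and the default window of 5 words are assumptions for illustration, not anything an existing engine documents:

```python
import re

def tokenize(text):
    """Lowercase words; a dot inside a token is kept, so 'x.25' is one token."""
    return re.findall(r"\w+(?:\.\w+)*", text.lower())

def near(text, left, right, max_gap=5):
    """True if any word in `left` occurs within max_gap words of any word in `right`.

    `left` and `right` are sets of lowercase tokens, so a set with several
    members acts as (a or b or c).
    """
    tokens = tokenize(text)
    left_pos = [i for i, t in enumerate(tokens) if t in left]
    right_pos = [i for i, t in enumerate(tokens) if t in right]
    return any(abs(i - j) <= max_gap for i in left_pos for j in right_pos)

# (lions or tigers or bears) near woods
print(near("We saw tigers deep in the woods.", {"lions", "tigers", "bears"}, {"woods"}))
# 'x.25' does not match a page about 45mm x 25mm brackets
print("x.25" in tokenize("Bracket dimensions: 45mm x 25mm"))
```

The -handbook part would just be one more filter on top: drop the page if "handbook" is in its token list.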
a search engine that searches the internet. Not parts of the internet, all of the internet.
I'd also like the search engine to do Boolean and regex.
P.S. I couldn't give a flying fuck if it:- has ads; tries to profile my search queries. I can at least attempt to avoid those things. But if it does not index the entire internet it's as useful as a range of shoes that consists of one size and one style only. And no, I don't care if it doesn't come with a free set of steak knives and is 100% dolphin free and kind to puppies, as long as it indexes everything.
I don't care if Bill Gates wrote the back end, hell, I'd use it even if it was run by the scumbag behind DuckDuckFuckADuck.
Thanks
Ability to right-click (or whatever) on a listed result and mark it as "I never want to see this site in any search I run on any topic ever" (useless result) or right-click on a listed result and mark it as "The content of this result isn't relevant to my search, block this page and all others like it from this search so I can find what I'm looking for" (irrelevant result/bad context) and re-run the search.
An open DECENTRALIZED search system would be way better than a system that's simply built on open source. Hell, for all we know maybe even Google is completely on open source. But the data sets that seed the search engine, without which the algorithms are simply crunching meaningless strings of letters, are kept close to Google's corporate bosom.
What we need is a search engine where everyone that searches can have access to the entire data set if she or he chooses to do so. This is similar to the way the Bitcoin blockchain works. Everybody can choose to have a copy of every Bitcoin transaction ever made, or if they're lazy or don't have the computing resources connect to a full Bitcoin node using something called a light wallet (which downloads only the relevant parts of the Blockchain related to the transactions made using a certain Bitcoin address).
So let there be a basic light version of your engine and a full version. If that's not feasible, maybe you can make an advanced client that processes only parts of the complete data set, but is distributed in such a way that the parts can easily be combined into a complete data set.
I'm sick of google adding referral parameters to all the urls that I click on from a search. It often adds 10 seconds or more (sometimes a link never comes up until you click it again). For that reason alone I switched to Bing, which I thought I would never do. At least Bing doesn't fuck with the links.
How about context switches? One for purchasing/acquiring, one for history, one for technical details, one for people related to an object? When I search now, I have to search between buying opportunities and buying opportunities and more.
Time for a new Political party in the US (or two!) One is off the rails Other cant pony up a leader.
I'll make my own with blackjack and hookers
https://www.youtube.com/watch?v=e35AQK014tI
Just look at all the features that Google keeps removing and add those in.
How about a token at the beginning (perhaps "++") that declares that every word must appear? (implied +word +word...)
How about a token at the beginning that declares that every word must appear in that order? (but not adjacent)
One thing I know I'd like to see, regular expressions, is probably prohibitively expensive, but it sure would be nice.
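The two prefix tokens are at least cheap, unlike regular expressions. A sketch of how they could be checked against a page, assuming "++" for "every word must appear" as suggested above and a made-up ">>" for "every word, in this order, not necessarily adjacent" (nobody proposed that spelling, it is just a placeholder):

```python
def matches(query, page_text):
    """Check one page against a query with an optional leading mode token."""
    page = page_text.lower().split()
    parts = query.lower().split()
    if not parts:
        return False
    mode, words = parts[0], parts[1:]
    if mode == "++":
        # Every word must appear somewhere on the page.
        return all(w in page for w in words)
    if mode == ">>":
        # Every word must appear in order: a subsequence test. Searching
        # the same iterator means each word is looked for after the last hit.
        remaining = iter(page)
        return all(w in remaining for w in words)
    # No mode token: match if any word appears.
    return any(w in page for w in parts)

print(matches("++ ford thunderbird", "1957 Ford Thunderbird for sale"))
print(matches(">> ford thunderbird", "Thunderbird by Ford"))
```

This splits on whitespace only, so punctuation stuck to a word would need stripping in anything real.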
J
I frequently search for a business and google displays the name, address and phone number in a very nice box in my browser. There is no way to move that information to my address book though. It seems like google could have a button to push a csv file to me.
The "anti-safe search" image search, which filters out all of the things that aren't porn.
Come on, be honest.
Karma: Terrifying (mostly affected by atrocities you've committed)
I want results that make sense for me. If I search for 'AWS' I want Amazon Web Services. If a professional welder searches for 'AWS' they will want the American Welding Society. If I search for the name of a place I want the place near me, not the small US town that copied its name. If I search for an error message I want the latest results first, I don't want email list posts from 10 years ago.
Also I want the search engine to care about my privacy, which sadly contradicts what I just described.
I want boolean searching on -only- what is visible on a page. None of that metadata stuff. That alone should bypass all those search conglomerators. I don't mind advertisements on the side, but not mixed with the results. I also want the results based only on what I searched for, no paying for higher rankings. That sounds simple enough...
"features that I would like to find in a service: respectful of user rights, ad-free, built upon open source software, and with auditable results"
Well, well, well. For me there's only one single feature of a search engine that makes it a go or a no-go: have a damn good indexing engine that can provide relevant results in a timely manner. Everything else is just a load of crap that I will never care about. If I can't find what I'm looking for, then I couldn't care less how it protects your rights or whether it is open source or not. Oh yeah, about that auditability... forget that. I don't want to find what other people think I should find, I want to find the best match for my queries. That said, good luck, develop away, maybe you'll indeed make a better indexer and ranker than Google's and we'll be all better off.
I am putting myself to the fullest possible use, which is all I can think that any conscious entity can ever hope to do.
I posted a request on Google Groups, years ago that Google implement a blacklist of sites, sites that I don't want to see in search results.
I despise results from other search engines, such as ask.com, scribd.com, and a dozen others that harvest other's answers and barrage me with a second layer of ads.
Google did implement "blocked sites" but eventually it removed it. Personally I would think the blocked sites list would be invaluable on a server side "page rank" algorithm. The more users blocking a given site, the lower the site's rank.
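Something like this is what I have in mind for feeding block lists back into ranking. It is a back-of-the-envelope sketch, not Google's algorithm; the linear penalty and the `strength` knob are assumptions:

```python
def demoted_score(base_score, blockers, viewers, strength=1.0):
    """Scale a relevance score down by the share of users who blocked the site.

    blockers: users who have the site on their block list
    viewers:  users who were shown the site at least once
    strength: 0.0 ignores blocks, 1.0 lets a fully blocked site drop to zero
    """
    if viewers <= 0:
        return base_score  # no data yet, leave the score alone
    block_share = min(blockers / viewers, 1.0)
    return base_score * (1.0 - strength * block_share)

print(demoted_score(10.0, 50, 100))   # half the viewers blocked it
print(demoted_score(10.0, 0, 100))    # nobody blocked it
```

Using the share of viewers rather than the raw block count keeps a huge, mostly liked site from being punished just because it has more visitors.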
Other than the obvious features you've mentioned in your article, simply stop returning other search engines and rank original content higher than harvested content.
Ability to search all blogs would be good.
currently there's restrictions on time and range of coverage
a) stop tracking my browser and devices everywhere we go on the internet. Stop it already - google, doubleclick, analytics, facebook, twitter, inst-fuck, whatever.
b) I'd like for my searches to work for other people. The same question should return the same results.
c) privacy. If I search for the anarchists cookbook, show the results and forget my search.
d) never search social anything. I simply don't care about that drivel. Let me blacklist websites that I never want results from - you know them - the sites that are just too hard to use like cnn, fox, abc, cbs, and 99% of the old tech magazine websites.
Privacy. Search by country, by date, by site. Enable quotes to force searches of "text in an exact sequence". Ads are OK.
I have come to dislike Google... the ads, the product placement, the why aren't you using chrome and signed into Google Plus? But Google still has better search results than Yahoo, Bing or DuckDuckGo. And when I say better I mean they actually get me relevant results. For example search for "Kyocera FS-#### driver" on Google and you get the Kyocera product page on the first page of results; on the competition you get a page of driver collections laden with viruses. Search for "Cat Climber Plans", heck, search for any kind of DIY plans on any search engine and look at the garbage that shows up. So good results are what would get me to switch.
Why doesn't google allow me to search for pages in some languages only?
Seriously, I keep trying to use DuckDuckGo as my main search engine but I constantly have to switch to Google because of this.
For most subjects, the time of the creation/indexing of the page is not that important, but if you are programming or if you are doing scientific research, you really need to be able to filter the most recent results.
This is what the semantic web is supposed to bring us, but we can do much more with what already exists.
A great deal has been achieved in AI and natural language understanding over several decades.
No doubt Google are working on it all right now...
1) Exclusions = in a search on X don't give me anything about Y
2) More control on match criteria (i.e. words must appear in sequence vs. can appear anywhere, frequency of word matters, title or keyword bonus up or down), date ranges or date range weighting...
3) I'd like to be able to indicate type of search: news, shopping, academic (i.e. give me papers), physical location...
4) Better handling of non-English words (give me English articles with this Italian phrase vs. give me Italian articles with this Italian phrase)
I'd like it to have an integrated init system. Or the other way round. Whichever you prefer, Herr Poettering - you're the boss.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Allow me to hit enter to search instead of clicking on the search button. And don't sell my search data. And make it work well.
One of the biggest annoyances of searching in English is that the results are coming from a gazillion places. Sometimes that's great, but when I look for lumber yards I really don't care about the ones in England, Australia, or New Zealand. They all might be fine businesses and a pleasure to deal with, but I doubt they deliver to the northeastern US. What also would help bilinguals like myself is to set two (or more) preferred languages. I speak and read two languages fluently and I don't mind and often want search results in both languages. Lastly, making it easier to search within results would be great. I know it is already possible with some search engines, but it is not easily achieved.
How about: search for exact word or phrase (eg. "within quotes") it is so frustrating that duckduckgo etc don't allow this. No padding results with false positives! Toggle as many options as possible, eg searching for alternate forms or spellings; numbered results; let the user have maximum control over the results.
Erm, bit of a stupid post isn't it?
He's saying everything that duckduckgo already does!
* completely privacy orientated;
* it's an open-source search engine allowing anyone to build upon it ( http://duckduckhack.com/ ).
* no ads, or basically non-intrusive, and respecting the users privacy;
* and a ton of add-ons that the community has built for "instant results".
https://duck.co/
I numbered my "features", but they really are in random order.
1. Don't assume that I entered a partial word or that you "know" better than I do what I want to search for. Specifically, if I use the search term "ord" I do not mean "order".
2. Give sufficient context into the results that I know how the page uses the terms. Having the context be part of the links going off the page is of very little value. Specifically, back to the "ord" search above, returning "http://ad.doubleclick.net/jump/%sitename%/blog;pos=%pos%;ord=123456789?" is useless.
3. Only index relevant stuff. See above ad.doubleclick.net example that should never be counted as a hit when searching for "ord".
4. Use https
5. We're addicted to speed. Results need to be returned in a reasonable time frame.
6. If I type in my search results and hit "Enter", take that as hitting the submit button.
7. Renaming the "reset" button "clean" seems like a needless change in terminology.
8. Advertisement that is relevant to the search THAT DOES NOT TRACK ME is tolerable as long as it is clear that it is advertisement. If I type in "tents for sale" I'm kind of asking for advertisements.
9. Don't track me. Don't remember me.
Google can filter by date, but it would be much easier for me if they could order by date, or at least put a greater weight on timeliness when deciding relevance. (This might have saved them a lot of bother with that whole right to be forgotten mess).
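Putting "a greater weight on timeliness" could be as simple as decaying the relevance score with the age of the page. A sketch only; the exponential decay and the one-year half-life are my own assumptions, and I have no idea what Google actually does:

```python
from datetime import date

def recency_weighted(relevance, published, today, half_life_days=365.0):
    """Halve a page's score for every half_life_days of age."""
    age_days = max((today - published).days, 0)
    return relevance * 0.5 ** (age_days / half_life_days)

today = date(2015, 1, 1)
old_but_relevant = recency_weighted(0.9, date(2005, 1, 1), today)
new_and_decent = recency_weighted(0.6, date(2014, 12, 1), today)
print(old_but_relevant < new_and_decent)
```

Make `half_life_days` a user setting and you get the whole range from "ignore dates" (very large value) to "order by date" (very small value).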
I'll also add another vote to being able to filter out the shopping sites when I'm looking for technical info
...on knowledge domains.
When Google first came out it was a wonder. It saved me an enormous amount of time (=money) in locating information I needed. But it rapidly deteriorated. I remember when there would be people who rejoiced that X billion new pages had been added to their database. I found that with each massive growth of the site it became harder and harder to find the information that I wanted.
I am retired now, but once upon a time worked developing applications in Cold Fusion. It was often easier to locate a bit of information by entering a description of what I wanted in a search engine and finding the answer as opposed to pulling out multiple books and checking the indexes. Results for the term Cold Fusion clustered in three areas - the application programming language, articles about nuclear energy, and long dissertations that were basically crackpots babbling. For a while it was possible to narrow the search by doing searches of comp Usenet newsgroups, but Google killed the facility of that when they smashed their Google Newsgroups into the mix, and the New York State attorney general then killed Usenet.
But the idea of knowledge domains implicit in the Usenet hierarchy would be very valuable if it could be applied to the internet. Usenet kept control of which groups could be added to 7 of its top levels and alt was a free-for-all. Instead of searching the comp.lang.coldfusion Usenet group it would be good to have a search engine at http://coldfusion.lang.comp./ Whatever organization controlled the site could determine what web sites were worthy of being included in the searchable database. All of the automated spam was a major problem on Usenet. Having control over what sites to cover would go a long way to alleviating this problem.
Of course, I have no power or influence about setting this up. All I know is that as far as I am concerned, the internet is fundamentally broken.
Not a web-page retrieval search, but a solutions retrieval.
I'd like to start with a query and have the engine then ask me some questions. More than disambiguation, I want it to discern the breadth and depth of my knowledge along the answer line. Then have the engine teach some basic (missing) foundation and fill in my holes up to the topic.
We could dialog back and forth, exploring solutions and arriving at the best ones for me.
Examples:
"How to provide a smart power grid for distributed 2-way customers?" Which would work into monitoring and anticipating power generation/draw, placement and sizing of transformers, capacitors & conductors, surge and safety, redundancy, weather, balancing, and even pricing models.
"How to minimize crime?" Which would work into size and type of scenario (home, business, city, country), laws. Models of culture, education vs crime. What do you mean by "crime"? Minimum levels of freedom, police, education. Even neonatal nutrition and care and their impact on crime.
Why won't this useless search engine tell me the best place to ford the thunderbird river?
It would, in English there is an actual difference between a small f and a capitalized one.
Many search engines have a "safe search" feature to hide NSFW material but no way to do the opposite (NSFW only). This has to change.
nt
Today's search engines like to optimize results for you. So you get the results based on past searches and clicks.
This puts you into your own personal bubble. You only find stuff you already know because you only get links relevant to your personal knowledge field. But what you actually search for is a different view on the world, not your own view.
To escape that, you would need to remove all cookies and even change to a completely different PC or ISP or country.
This should be a user-accessible feature. Sort of "amnesia search".
Atari rules... ermm... ruled.
I would like to see the search engine not use any algorithm to give me my "PERSONALIZED" search results, as I feel that will create a technological bubble and I, the user, will be limited to that bubble and not be exposed to anything that I object to or do not agree with (which is basically all websites today).
I want to be able to filter out any results that have their content spread across multiple pages. I'm sick of slideshows, sick of having to click through pages. If the content isn't on a single page I just go for the next result.
Google is far worse than it used to be, signal-to-noise ratio. Part of the time, it does not appear to respect quoted search terms, even with a "+" in front. I now frequently see part of a word that was quoted, bolded in the non-sponsored results.
It also refuses to allow explicit literal searches: I have an artist friend who uses a period as part of her name - google says "nope, not gonna look for name. lastname, I'll ignore it and look only at name lastname".
Finally, I've found too many times in the last year that one of my search terms isn't even mentioned on the page from the results.
mark
https://www.google.com/insides...
https://github.com/trivio/comm...
If you have type ahead/search predictions/whatever enabled, and I type half of my search query then you predict the other half, then I continue typing *exactly what you have in your predictive search result* it should NEVER remove that predictive search result.
How is matching what you were suggesting more exactly lowering its predict score? how?!
It should respect my privacy, like DuckDuckGo: it should default to not tracking me and should not keep my cookies. It should also donate part of its profit to nature, like Ecosia.
Google can be uncanny when it comes to finding what I'm looking for based on what I ask for. But I miss AltaVista and some others with which I could find seldom-visited sites in the dark corners. I want a search engine that will return the best matches to what I ask regardless of their popularity. I realize that it was getting hard to avoid all the spammy sites, but maybe the search items returned could be accompanied by a trash rating. If during its spidering, it finds a lot of trash-looking links or malware from the site's pages, that number would be high. If there are no links outward and no signatures of malware from the site's pages, that number would be very low.
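The trash rating could be computed from counts the spider collects anyway. A hypothetical sketch: the four inputs, the 0 to 100 scale and taking the worse of the two shares are all my own assumptions about how such a number might be built:

```python
def trash_rating(outbound_links, flagged_links, malware_hits, pages_crawled):
    """Return 0 (clean) to 100 (trash) for one site.

    outbound_links: links pointing off the site found while spidering
    flagged_links:  how many of those looked like trash to the spider
    malware_hits:   pages on the site matching a malware signature
    pages_crawled:  pages of the site the spider fetched
    """
    if pages_crawled <= 0:
        raise ValueError("need at least one crawled page")
    if outbound_links > 0:
        bad_link_share = min(flagged_links / outbound_links, 1.0)
    else:
        bad_link_share = 0.0  # no links outward counts as clean
    malware_share = min(malware_hits / pages_crawled, 1.0)
    # The site is only as clean as its worst signal.
    return round(100 * max(bad_link_share, malware_share))

print(trash_rating(0, 0, 0, 50))      # no outward links, no malware
print(trash_rating(200, 150, 0, 50))  # most outward links look like trash
```

Shown next to each result, this would rate the site without hiding it, so the seldom-visited corners stay findable.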
(almost anything) would be better than the ad-driven nonsense that passes for an Internet search these days
So any site that uses discus would vanish from my search results.
Did you mean Disqus? So I guess unlike a lot of other users who have complained in comments to this story about Google's, you want a web search engine to correct your spelling.
Those scumbags need to burn in hell
Could you explain why Disqus are more "scumbags" than other comment section hosts? Or could you explain why all comment section hosts are "scumbags"?