Incorporating Machine Learning into Firefox 2.0?
blakeross asks: "I will be doing research this summer at Stanford with Professor Andrew Ng about how we can incorporate machine learning into Firefox. As we work to finish up Firefox 1.0, we're also seeking ideas that will make Firefox 2.0 blow every other browser out of the water. People who come up with the best 3-5 ideas that involve the use of machine learning will win Gmail accounts, and if we implement your idea you'll be acknowledged in both our paper and in Firefox credits. Your idea will also be appreciated by the millions of people who use Firefox. We'll also entertain Thunderbird proposals. See my weblog post for more details; I'll read all comments posted in response to this story or to my weblog."
looks like you want a browser that will sort your pr0n for you ....
1) Make it faster
2) Please keep GTK+ 1.x support
"we're also seeking ideas that will make Firefox 2.0 blow every other browser out of the water."
The competition: Internet Explorer, Netscape, Lynx, and Safari.
I'd say it's already pretty much covered...
(love my FireFox)
"In a Democracy, people get the kind of government they deserve." -Winston Churchill
...a browser that doesn't have machine learning in it. Seriously, Firefox is slow enough for me. What on earth would you possibly need "machine learning" for in a web page browser? I'd immediately switch back to Opera (I don't use it simply because input forms lag during page-loading, some sort of multithreading issue).
That kind of automatic crap is the same sort of stuff people would bitch about if Microsoft put it into IE. I mean, do you really want your browser actually learning anything about you? Imagine the havoc it could wreak, especially if trojans started fucking around with it.
Just give me the leanest, meanest browser out there. That's all Firefox 2.0 needs to be. Not a damn learning machine. Sheesh.
1. Based on the user's browsing habits, automatically bookmark the most frequently visited sites, and automatically put them into *multiple* categories (not just one category) to make them easy to find.
No, don't do that. You think I want my favorite porn sites automatically bookmarked without me realizing it so my wife can see it and bitch me out?
OR how about at work when some site I look at while goofing off ends up in my favorites? Ya, I really need some sports websites showing up in my favorites at work when the project manager tries to find a link to some development site at my workstation...
Seriously, I would not want that feature at all. Not unless
1) you can shut it off:
or
2) the machine learning is REALLY good and knows not to put porn or sports websites into my favorites but does put websites with API documentation and technews...good luck with that one.
same with middle click scroll, have a transparent gray line where the top of the page was when middle click scroll was clicked!
Does anyone else get the feeling that they are adding this just for the sake of it or so they can say they have it? I mean when you have the technology before any useful uses for it then clearly there is something wrong.
I think that creating a good browser through gimmicks is a poor long term strategy and seriously doubt this route will turn up anything useful. Ideas should be so simple and obvious that they inspire us to say 'why don't we have that already?!', not something we have to search for!
Which browsers are you talking about? Are you thinking about the lame ass implementation in Firefox that searches within bookmark titles and URLs? Bookmark search should actually search on bookmarked pages themselves. Nothing less will do.
Your pizza just the way you ought to have it.
Whoa! Good call! That's an awesome, basic feature that could easily be added to FireFox without bogging it down.
(So many of the other suggestions so far would make FireFox slow to a crawl. Let's keep it lean and mean, please!)
Nice idea. But you would only want it to happen when there is a large section of text that goes off the bottom of the screen. Part of the learning for this feature would have to be to recognize when this is the case, i.e. you would not want it to happen if you are scrolling down to look at a picture or if you are already at the bottom of the text.
Isn't this going about things backwards a little?
To me this sounds like a clear case of "technology X is really cool. Let's find some reason to include it in product Y." Which often means that product Y becomes much more complicated than it needs to be.
How about first looking for a list of browser "needs" so to speak. What would make the best browser? What current deficiencies do browsers have? And so on. Then, if you really want to, try to figure out if any of these problems could be solved with machine learning.
Don't just inject a technology into a product because it's cool. Make sure there's a real need for it.
Who said Freedom was Fair?
Make it an extension only!
Seriously, it would be a really neat feature if some of the suggestions posted here were realized... but this whole idea screams of bloat bloat bloat. What makes FireFox so appealing for some (including me) is its compactness and lack of bells and whistles. The FireFox project FAQ echoes these sentiments: It's small, fast, simplified, nothing other than what you need. "Just a browser"
Don't let feature creep ruin it!
=Smidge=
The diff is that the pages I had already bookmarked ARE relevant to ME. And it's been countless times that I found a really good page only to forget its url and struggle to get it out of google again. Searchable bookmarks are orthogonal to Google. we need both.
Your pizza just the way you ought to have it.
I hate it when anything software tries to "predict". I don't want it. Please make sure it has an OFF button. Seriously. Thank you.
I -love- this idea! I've suffered from the same problem for years, especially in long articles like those to which Slashdot frequently links, and never bothered thinking about how to fix it-- but this idea is a WINNER!
Yeah yeah, I know, I'm not really adding anything to the value of this thread-- but i wanted undertow3886 to know how much I like the way he (she?) thinks.
Allegedly real newspaper headline from 1998:
Man Struck by Lightning Faces Battery Charge
Do NOT bloat the browser.
Want to add crap? PLUGINS!
Hate me!
...Is to make it very easy to turn whatever machine learning features get incorporated into 2.0 off totally, with minimum fuss and searching.
It is my firm belief that the #1 rule of UI design is that the program should look and act consistent. And the number two rule is that the program should never assume anything, or perform any action without the user explicitly telling it to (barring sane default behaviors that will fit > 85% of the users). Every ML feature I have ever seen breaks #1 and #2 with reckless abandon by changing something to make it more 'friendly', which in turn makes it less friendly because I don't know _exactly_ what to expect from my program.
Looking at the comments on that weblog, I can not find a single idea that does not either violate my top two rules, or would otherwise annoy me to no end. If they have to add that to Firefox then please, let me turn that crap off in three mouse clicks or less.
This is the only suggestion so far that really seems worth making the browser larger (and hence, slower).
You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
Here's an example:
Bayesian filtering
Thunderbird wouldn't be the same without it. Does it drag your system to a halt? Nope.
I'd be awfully surprised if anything real CPU intensive would ever be installed into Firefox by default. Give these guys some credit.
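For anyone wondering how little work this kind of filter actually does per message, here is a rough sketch of a naive Bayes text classifier. It is not Thunderbird's actual filter; the class name `NaiveBayes`, the "spam"/"ham" labels and the whitespace tokenizer are all made up for illustration.

```python
import math
from collections import Counter


class NaiveBayes:
    """Toy two-class naive Bayes filter (illustrative, not Thunderbird's code)."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        # Count one document and every whitespace-separated word in it.
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def score(self, text, label):
        # Log prior plus summed log likelihoods, with add-one smoothing
        # so unseen words do not zero out the whole score.
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        logp = math.log((self.doc_counts[label] + 1) / (total_docs + len(self.LABELS)))
        for word in text.lower().split():
            count = self.word_counts[label][word]
            logp += math.log((count + 1) / (total_words + len(vocab) + 1))
        return logp

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.score(text, label))
```

Training is just counting words, and classifying is a handful of additions per word, which is why this sort of thing does not have to drag a system down.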
Ironically, the word ironically is often used incorrectly.
see, that's what I meant about explaining what you meant. You say browsers have never had the feature of "searchable bookmarks", but browsers do have that feature, just not the feature you're now describing.
I knew that's what you probably meant, but you could have also been blind or something.
-- 'The' Lord and Master Bitman On High, Master Of All
Adding what passes for "machine learning" to a user interface usually results in something that does the right thing some of the time, the wrong thing some of the time, and you can't figure out why.
Bayesian spam filtering is becoming like that. At first it worked, but it's breaking down under the rising percentage of spam.
If this was extended to also search pages in your history ("What was that really good site about fixing roof leaks I found 2 days ago?") it would be even handier. I don't always have the foresight to bookmark every useful page I stumble upon, but I have my history set to go back 2 weeks (I really don't care if someone sees what sites I've been to, I'm not ashamed :)
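A minimal sketch of searching the text of saved pages (bookmarks or history) rather than just titles and URLs, assuming the browser can hand over each page's text. `PageIndex` and its methods are hypothetical names, and there is no stemming or ranking here, just a plain inverted index.

```python
import re
from collections import defaultdict


class PageIndex:
    """Toy inverted index: word -> set of URLs whose text contains it."""

    def __init__(self):
        self.postings = defaultdict(set)

    @staticmethod
    def _words(text):
        # Lowercase alphanumeric runs; no stemming or stop words.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, url, text):
        for word in self._words(text):
            self.postings[word].add(url)

    def search(self, query):
        # Return the URLs that contain every word of the query.
        words = self._words(query)
        if not words:
            return set()
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result
```

Feeding it history pages instead of (or as well as) bookmarked pages is the same call to `add`, so the "roof leaks from 2 days ago" case comes for free once the text is available.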
DJ kRYPT's Free MP3s!
Have it learn where I am saving which files and offer up that directory as default. If I am saving all pictures into one directory and all movies into another it should know that.
I want virtual folders in my mail. These are "live" queries like "all mail today" or "all mail marked urgent". As I mark metadata on the email it will show up in the proper virtual folder.
Full text search of all email.
Choice of multiple home pages. It learns when I want my home page X and when I want home page Y.
Roaming bookmarks!!! While I am at it, roaming everything including profiles and preferences. The ability to carry my email filters from location to location would be awesome.
A network install where the administrator can set global prefs and install global plugins. I also want the option to override the users' preferences and lock them out of certain settings.
It should learn to adjust my font size (and other settings?) based on site. If a web site always puts tiny print then I want the fonts larger only for that site. Perhaps have it learn "ugly" sites and put my default styles instead.
Auto proxy. I want to feed a list of proxy servers and have it switch randomly (even from one site to another). Think of this as super privacy.
Ability to arbitrarily morph the incoming text stream using regexp or javascript. This would allow me to roll my own weird crap.
Make XUL 50 times better. Make it so it's trivial to use XUL to make database front ends. Give me a great GUI builder for it.
I have lots more ideas but that's enough for now.
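The save-location idea from the top of that list hardly needs heavy machinery. A rough sketch, assuming the browser can record which directory each download ended up in; `SaveDirGuesser`, its methods and the `~/Downloads` fallback are invented for this example.

```python
from collections import Counter, defaultdict
from pathlib import PurePosixPath


class SaveDirGuesser:
    """Suggest a save directory from where files of that type went before."""

    def __init__(self, fallback="~/Downloads"):
        self.fallback = fallback
        # file extension -> Counter of directories chosen for it
        self.history = defaultdict(Counter)

    @staticmethod
    def _ext(filename):
        return PurePosixPath(filename).suffix.lower()

    def record(self, filename, directory):
        # Call this every time the user finishes a "Save As".
        self.history[self._ext(filename)][directory] += 1

    def suggest(self, filename):
        seen = self.history.get(self._ext(filename))
        if not seen:
            return self.fallback
        # Most frequently chosen directory for this extension.
        return seen.most_common(1)[0][0]
```

Keying on the file extension is the simplest choice; keying on MIME type or on the site the file came from would be a straightforward variation.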
evil is as evil does
All good ideas and I am sure that many people will come up with some other good ones, but please, the most important thing is to give the option to easily Turn Them Off! For example I would like to be able to turn JavaScript on and off from a button on the browser. In the same way, it would be nice to be able to customize a toolbar with on/off buttons for those features that I maybe don't want to use all the time.
Yahh, hiii haaaaa! -Major Kong, from Dr. Strangelove
Extension. It's why that framework exists.
I don't want my browser learning, tracking, filtering, bookmarking, or otherwise doing anything with any data other than exactly what I tell it and I don't want it asking me if I'd like to do something, as if I didn't know. This also includes storing, caching or anything else.
Am I paranoid? Maybe, or maybe with all of the privacy invasion from big brother these days I'd like a little control.
I'm sure our clever Mozilla developers would give us a way to turn off any "advancements". Wasn't firefox supposed to be lightweight anyway?
If you're going to make any options that store, learn, process, remember, filter or otherwise monger after my data, don't turn it on by default, it sounds like a security bungle, or at least abuse.
I had a lot of material I was going to post, but I pulled it out and will likely send it offline.
One of the things I think people place too much emphasis upon is the "Mine's Bigger!" syndrome. This happens in far too many facets of the workworld: the sweeps for the local news, when they've pulled out all the stops to find the juiciest stories which will make the others envious.
When I wrote most of my initial message, there were a bunch of messages which applied primarily to formatting or things which would be kewl to power users and geeks. That leaves a lot of people out. Send out the best new models for each critic. "This one is really cool and it's got every feature unknown to man, but unless you're Steven Jobs, forget it." Compared with, "You know, the features and options are not quite what the others are, but anyone can use it." Now, which one of those projects would you want to work on? I can guarantee the sales of the latter would be far more than the former.
Probably one of the biggest things overlooked in the browser et al. market is the searching mechanism. Unfortunately, far too many services which provide search mechanisms have a "mine's bigger" syndrome when it comes to speed. "We serve up 100M requests a day with a minimum return of x time."
Who cares? Most of the time a browser or any other search mechanism is involved, the statistics should be on how fast the user is able to find their desired search, not the search they have to enter with an improperly designed interface. Take Google.com with an advanced search. First, that presumes I'm presenting myself as an advanced user. That means I should be presented with an advanced screen. If that's advanced, there needs to be another level beyond that. Most of the time I have searches, I have to supply dunsel searches, then hack the supplied values in the supplied text box with the results of the text, then rerun it.
I've got more to say on my soapbox, but I'll send it offlist.
Bottom Line: "Smarter" searching, not only for the occasional searches but for those who are labelled "giants among ants" (which I am not, by any stretch of the imagination; I just knew I'd have a chance to use that phrase).
Don't ask me if I want to remember a username/password combo until AFTER the login has been successful.
Spoon not. Fork, or fork not. There is no spoon.
Firefox needs an option to make the browser detect, and work around, user-interface abnormalities in poorly-designed websites.
It's fairly well-established that the best user interfaces are the ones where there is no discrepancy between what the user thinks is going to happen, and what actually happens.
When a user single-clicks a link, the link should open in the current window. Always. Any other behaviour (such as opening a new window) causes the user to be frustrated (or at least slowed down).
Similarly, when the user middle-clicks a link (or shift-clicks or whatever), the link should always open in a new window/tab. No oddities like "javascript:gotosite()" or "http://path/to/exact-same-page.html#" should happen.
Unfortunately, there are a lot of misguided website authors that think they're being helpful by doing non-standard things in an attempt to anticipate users' needs. This means that you'd need some type of machine-learning in order to work around these problems at the browser level.
I imagine this would be done in a way similar to how SpamAssassin works.
Many pages are cluttered with navigational junk and ads that detract from the interesting content. Take a look at www.cnn.com, for example. The story text is in the middle, and that's what I'm interested in, but all the buttons, ads, additional information, etc. take up a lot of space.
Automatically identifying the main content of a page, and fading everything else out a bit would be very helpful.
Some sites take an article and break it up into several pages. It would be useful to automatically recognize that, fetch the continuation pages for the article, and pull the relevant content back into the original page.
I frequently adjust different aspects of my browser for different sites. Adjusting the window size/position, bump up font size by 10%, allow/block images, whatever.
I'd like a system that remembers those adjustments, and not only reuses them when I return to the same site, but applies them again where appropriate. 'Where appropriate' is where machine learning comes in.
plus-good, double-plus-good
You mean more new features.
Forget new features, just fix the bugs. There are bugs (some inherited from Mozilla) that make Firefox unusable on some Linux systems. If you want ideas for what to work on, go to Mozilla's bug list.
This would be about 1000 times more useful than putting in yet more code bloat which will introduce yet more bugs. Of course, it won't gratify your ego as much. It's a question of what your goal is - accomplish something useful for the community, or pump up your ego.
All this talk about machine learning is great, but I would absolutely love to have ONE button that will quickly "pause machine learning, cookie enabling, disk caching, history logging, and whatever else". ie, a "privacy button".
A lot of times, I don't care to have my actions logged forever, but at the same time, I don't want to have to go through all settings and change them manually, or completely nuke all my bookmarks, cookies, and disk cache from the last few months.
When would I not want my actions logged forever? I can just see posts joking about pr0n headed this way, but in all seriousness:
- looking for another job during lunch at work
- searching for a surprise gift/vacation for the gf/wife while at home
- borrowing a friend's browser for a few minutes to do some on-line banking
- etc...
My 2cents...
Have we already forgotten Microsoft's Clippy?
Please no!! This is exactly the thing I hate. The key words here are *TRY TO* serve you better -- those "smart" menus almost always guess wrong!!
I don't want a program that tries to guess what I want to do next -- I know what I want to do and I want a program that stays out of my way and lets me do it.
Firefox totally rocks (except for the really really stupid name, but that's another issue) and it totally blows away every other browser, despite the fact that it hasn't even reached v1.0 yet.
Please don't screw it up.
that's in reference to modifying currently executing code in tight loops. that used to be a common way to avoid branching back in the day.. nowadays if you have to use a conditional jump in a tight loop it's usually not faster to try and work around it.
lately the term "self modifying code" is commonly attributed to dynamic code generation (it does sound cooler), but dynamic code generation is still the best way to accomplish many things and nothing intel says about "self modifying code" applies to dynamic code generation techniques.
anyways, what your parent is speaking of is neither of these things. loading only the code that is required is a technique known as "late binding" and is a great way to modularize an otherwise bloated application.. I think firefox is already on this path with its extensions. hopefully they remove more extension-like features from the main app and implement those features in extensions, perhaps ones that are installed by default.
bite my glorious golden ass.
That would be kind of useful, if I wasn't currently drifting away from browser-handled bookmarks and instead using a customized start page using CSS-generated menus, in which you can fit anything from links to the Google search <form>.
hmm, that would be a nice feature - a start page generated from your bookmark folders, utilizing meta-bookmarks (which are in fact HTML snippets). Add customizable CSS and a name like about:start and I'd be sold.
And if you want to cram a learning algorithm into that, make the code that generates the start page sort the folders and bookmarks by how often you use them.
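The "learning algorithm" part can be as plain as counting visits. A small sketch, assuming bookmarks come in as a folder-to-URLs mapping and that a list of visited URLs is available; `sort_by_usage` and both argument shapes are made up for illustration.

```python
from collections import Counter


def sort_by_usage(folders, visits):
    """Order bookmarks within each folder, and the folders themselves,
    by how often their URLs appear in the visit log (most used first).

    folders: dict mapping folder name -> list of URLs
    visits:  iterable of visited URLs (repeats count as repeat visits)
    """
    hits = Counter(visits)
    ranked = {
        # sorted() is stable, so equally used bookmarks keep their order.
        name: sorted(urls, key=lambda url: -hits[url])
        for name, urls in folders.items()
    }
    order = sorted(ranked, key=lambda name: -sum(hits[u] for u in ranked[name]))
    return [(name, ranked[name]) for name in order]
```

The generated start page would then just walk the returned list top to bottom.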
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
Switch back to the previous tab when closing the current one instead of just switching to the rightmost tab.
:/
A bit annoying as it is now
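One way to get that behaviour is to keep a most-recently-used list next to the left-to-right tab order. A minimal sketch, not Firefox's tab code; `TabStrip` is a hypothetical class and tab titles are assumed to be unique.

```python
class TabStrip:
    """Toy tab bar that returns to the previously used tab on close."""

    def __init__(self):
        self.tabs = []  # left-to-right order on screen
        self.mru = []   # usage order, most recently selected last

    @property
    def current(self):
        return self.mru[-1] if self.mru else None

    def select(self, title):
        # Move the tab to the end of the usage list.
        if title in self.mru:
            self.mru.remove(title)
        self.mru.append(title)

    def open(self, title):
        # New tabs open at the right edge and take focus.
        self.tabs.append(title)
        self.select(title)

    def close_current(self):
        # Drop the focused tab; focus falls back to whatever was used
        # before it, not to the rightmost tab.
        if not self.mru:
            return None
        closed = self.mru.pop()
        self.tabs.remove(closed)
        return self.current
```

With tabs A, B, C open, selecting A and then opening and closing a new tab D lands back on A rather than on C.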
I think that the existing "Minimum font size" pref is a better (not to mention cleaner) way of solving the same problem.