Slashdot Mirror


Incorporating Machine Learning into Firefox 2.0?

blakeross asks: "I will be doing research this summer at Stanford with Professor Andrew Ng about how we can incorporate machine learning into Firefox. As we work to finish up Firefox 1.0, we're also seeking ideas that will make Firefox 2.0 blow every other browser out of the water. People who come up with the best 3-5 ideas that involve the use of machine learning will win Gmail accounts, and if we implement your idea you'll be acknowledged in both our paper and in Firefox credits. Your idea will also be appreciated by the millions of people who use Firefox. We'll also entertain Thunderbird proposals. See my weblog post for more details; I'll read all comments posted in response to this story or to my weblog."

11 of 806 comments

  1. The top five ideas by Anonymous Coward · · Score: 5, Interesting

    Here are the best five ideas incorporating machine learning:

    1. Based on the user's browsing habits, automatically bookmark the most frequently visited sites, and automatically put them into *multiple* categories (not just one category) to make them easy to find.

    2. Create a full-text index in real-time of every page that has been browsed. When the user visits any web page, display a sidebar of "Related previously-viewed pages."

    3. A Google-News-like consolidation feature for the user's most-frequently visited news site, automatically highlighting stories of interest based on ones they've previously viewed.

    4. Allow the user to select "Fewer images like this," "More images like this," "Less text like this," or "More text like this," and use Bayesian or similar filters to automatically block or highlight content. This could block advertisements or highlight certain key passages.

    5. Allow the user to browse their own hard drive, and categorize content automatically ("this is a document about lambs" ... "this is a picture of a sunflower") and let them group and search for items. E.g. "Pictures like this" or "Documents about cats."
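
    As a rough illustration of idea 2 ("Related previously-viewed pages"), here is a minimal sketch that indexes page text as word counts and ranks earlier pages by cosine similarity. The PageIndex class, the example.com URLs and the sample text are all hypothetical, and a real full-text index would need stemming, stop words and persistent storage.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class PageIndex:
    """Hypothetical in-memory index of the text of visited pages."""

    def __init__(self):
        self.pages = {}  # url -> Counter of word counts

    def add(self, url, text):
        """Index (or re-index) one visited page."""
        self.pages[url] = Counter(tokenize(text))

    def related(self, text, limit=3):
        """Return up to `limit` indexed URLs most similar to `text`."""
        query = Counter(tokenize(text))
        scored = [(cosine_similarity(query, vec), url)
                  for url, vec in self.pages.items()]
        # Drop pages that share no words with the query at all.
        scored = [(score, url) for score, url in scored if score > 0]
        scored.sort(reverse=True)
        return [url for _, url in scored[:limit]]


index = PageIndex()
index.add("http://example.com/lambs", "lambs and sheep grazing on a farm")
index.add("http://example.com/sunflowers", "a field of sunflower plants in bloom")
print(index.related("raising sheep and lambs"))
```

    The sidebar would call related() with the text of the page currently being viewed.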

    Please give my Gmail accounts to Gmail for the troops.

  2. ideas by pangel83 · · Score: 5, Interesting

    - Pop-up management in the modern browsers that provide it is more effective than in the past, but it is still not perfect. Adapt to which pop-ups a person normally uses.

    - Content highlighting (especially in news sites). Learn what types of news articles / subjects a user is interested in, and highlight titles in news pages that suit the user.

    - Accelerator for narrowband connections. Predict which pages the user is more likely to visit next, and start loading them as the user still reads the previous page.

    - Efficiently recognise scam sites? Protect users from fraudsters?
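
    The accelerator idea above could be sketched as a first-order Markov model: count which page the user opened after each page, and prefetch the most frequent successors. This is only one possible approach; the NextPagePredictor class and the example paths are hypothetical.

```python
from collections import Counter, defaultdict


class NextPagePredictor:
    """Hypothetical predictor: counts observed page-to-page transitions."""

    def __init__(self):
        # from_url -> Counter of URLs the user opened next
        self.transitions = defaultdict(Counter)

    def record(self, from_url, to_url):
        """Note that the user went from `from_url` to `to_url`."""
        self.transitions[from_url][to_url] += 1

    def predict(self, current_url, limit=2):
        """Return up to `limit` most likely next URLs, most frequent first."""
        followers = self.transitions.get(current_url)
        if not followers:
            return []
        return [url for url, _ in followers.most_common(limit)]


predictor = NextPagePredictor()
for _ in range(3):
    predictor.record("/index", "/news")
predictor.record("/index", "/about")

# Candidates to start loading while the user is still reading /index.
print(predictor.predict("/index", limit=1))
```

    On a narrowband connection the limit would probably have to stay small, since every wrong guess costs bandwidth.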

    PS: Not machine learning, but my sole requirement for a browser (dunno if it's done in Firefox now, as I haven't used it for a long time): open a new tab by default rather than a new window, or at least provide the option.

  3. idea by undertow3886 · · Score: 5, Interesting

    Make it so when the user hits the Page Down key, a horizontal line appears for a few seconds where the old bottom of the page was, then fades away. So when you're reading long sections of text and hit Page Down, your eye can quickly scan to where you left off.

    --
    Sick of people knocking on Gentoo's greatness in completely unrelated .sigs? Me too!
  4. Screw machine learning... by MSBob · · Score: 5, Interesting
    I've been waiting for searchable bookmarks for about a decade now and they have yet to appear in any web browser. Bookmarks as implemented in today's browsers are useless. They are unmanageable beyond twenty or so and the interfaces to keep them "organized" in "folders" are clumsy at best.

    There. Your most important feature that browsers never had. Searchable bookmarks. Doesn't get much simpler than that. Am I the only one who thinks it's something every browser should have had a long time ago?

    --
    Your pizza just the way you ought to have it.
  5. Spyware Filter Integrated In Download Manager by dduardo · · Score: 5, Interesting

    The Firefox download manager should scan downloads for malicious spyware, stop the bad download(s) and warn the user of the danger posed by the file(s).

  6. Make autocompletion more efficient by zwalters · · Score: 5, Interesting
    A moderately annoying, but extremely common procedure when I'm browsing is to have a specific destination in mind, say Baseball Primer http://www.baseballthinkfactory.org/files/primer/

    Now, because this has a lot of discussions, when I start typing basebal... I get a lot of urls in the autocompletion field like http://www.baseballthinkfactory.org/files/primer/oracle/

    or even unrelated baseball sites. So it's not uncommon for me to have to press downarrow several times. A very useful application of machine learning would be to order the autocompletion possibilities so that my average number of downarrow presses is minimized.
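
    A minimal sketch of that ordering, assuming the browser remembers how often each URL was actually picked from the dropdown: if entry i needs i downarrow presses, the average number of presses is lowest when entries are sorted by pick count, highest first. The CompletionRanker class and the pick counts are made up for illustration.

```python
from collections import Counter


class CompletionRanker:
    """Hypothetical ranker that orders completions by past picks."""

    def __init__(self):
        self.picks = Counter()  # url -> times the user picked it

    def record_pick(self, url):
        """Note that the user chose `url` from the dropdown."""
        self.picks[url] += 1

    def rank(self, typed, history):
        """Matching URLs from `history`, most frequently picked first."""
        matches = [url for url in history if typed in url]
        # sorted() is stable, so ties keep their order from `history`.
        return sorted(matches, key=lambda url: -self.picks[url])

    def expected_presses(self, ordering):
        """Average downarrow presses if entry i costs i presses."""
        total = sum(self.picks[url] for url in ordering)
        if total == 0:
            return 0.0
        return sum(i * self.picks[url] for i, url in enumerate(ordering)) / total


history = [
    "http://www.baseballthinkfactory.org/files/primer/oracle/",
    "http://www.baseballthinkfactory.org/files/primer/",
]
ranker = CompletionRanker()
for _ in range(9):
    ranker.record_pick(history[1])
ranker.record_pick(history[0])

ranked = ranker.rank("basebal", history)
print(ranked[0])
print(ranker.expected_presses(history), ranker.expected_presses(ranked))
```

    With these made-up counts the original order costs 0.9 presses on average and the ranked order 0.1.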

  7. Improve URL matching in the address bar by slobber · · Score: 5, Interesting

    Currently, if I start typing a URL in the address bar, it matches URLs alphabetically. This gets very annoying at times, especially if I accidentally type giigle.com instead of google.com and it then keeps matching giigle.com for weeks whenever I type "g".

    This problem can be fixed by using a frequency count with some time decay. For example, if I went to google.com 100 times within the last week and once to giigle.com, then match google.com on "g". If, however, I went to giigle.com 5 times recently, then match giigle.com.

    While one might argue that this makes the algorithm unpredictable from the user's standpoint, in my experience people keep on typing until they see the correct match. So, this way they'll see the right match sooner on average.
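
    One way to sketch a frequency count with time decay is to give every visit a weight that halves after a fixed number of days and sum the weights per site. The 7-day half-life, the function names and the visit data below are assumptions for illustration, not anything Firefox is known to do.

```python
HALF_LIFE_DAYS = 7.0  # assumed: a visit loses half its weight every 7 days


def decayed_score(visit_ages_days, half_life=HALF_LIFE_DAYS):
    """Sum of per-visit weights, where each weight is 0.5 ** (age / half_life)."""
    return sum(0.5 ** (age / half_life) for age in visit_ages_days)


def best_match(prefix, visits, half_life=HALF_LIFE_DAYS):
    """Pick the host starting with `prefix` that has the highest decayed score.

    `visits` maps a host name to a list of visit ages in days.
    """
    scores = {host: decayed_score(ages, half_life)
              for host, ages in visits.items()
              if host.startswith(prefix)}
    if not scores:
        return None
    return max(scores, key=scores.get)


# 100 visits to google.com spread over the last week, one typo yesterday.
this_week = {
    "google.com": [i * 0.07 for i in range(100)],
    "giigle.com": [1.0],
}
print(best_match("g", this_week))

# Assumed variation: the google.com visits are two months old,
# while giigle.com was visited 5 times today.
stale = {
    "google.com": [60.0] * 100,
    "giigle.com": [0.0] * 5,
}
print(best_match("g", stale))
```

    The first call picks google.com and the second picks giigle.com, because 100 visits from 60 days ago decay to a combined weight well below 1.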

    --
    "You mortals are so obtuse." -Q
  8. Bookmark filtering in Firefox suggestion by talaphid · · Score: 5, Interesting

    Speaking of Bayesian filtering, how about some form of cleverer guessing as to which folder in my eclectic collection of bookmarks my next bookmark goes in? Sample relatively unique keywords in pages as bookmarked, weight towards bookmark folder baskets, bingo.

    Avoid more sophisticated algorithms that infer a sorting methodology the same as the developer's, however. Maybe I have a Programming folder which has C in it, and so you'd infer that all characteristics of matches to Programming inherit to C, if that's the sort of sorter you are, and that fits with you, me, and program-think, so that's right? Right? Except perhaps I'm a university student who has a University folder, and I'm studying Java, whose extrinsic attribute prioritizes sorting it into that group... so you'd end up with a word weighting argument between superclass Programming, which is wrong, and Java, which is right.

    Let me be clear. This suggests nothing at all about helping the user organize their bookmarks - everyone has their own system (although perhaps a Bayesian category guesser would be a separate fun feature). This suggestion is merely better guessing of first suggested folder when I CTRL-D.
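
    A minimal sketch of such a first-folder guesser, assuming a naive Bayes classifier over the words of already-bookmarked pages. It treats every folder as a flat basket and knows nothing about folder hierarchy. The FolderGuesser class, the folder names and the sample text are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


class FolderGuesser:
    """Hypothetical naive Bayes guesser for the first suggested folder."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> word counts
        self.bookmark_counts = Counter()         # folder -> bookmarks filed
        self.vocabulary = set()

    @staticmethod
    def _words(text):
        return re.findall(r"[a-z]+", text.lower())

    def learn(self, folder, page_text):
        """Record that a page with this text was filed under `folder`."""
        words = self._words(page_text)
        self.word_counts[folder].update(words)
        self.bookmark_counts[folder] += 1
        self.vocabulary.update(words)

    def guess(self, page_text):
        """Return the most probable folder, or None before any training."""
        if not self.bookmark_counts:
            return None
        total = sum(self.bookmark_counts.values())
        words = self._words(page_text)
        best_folder, best_score = None, -math.inf
        for folder, count in self.bookmark_counts.items():
            # Log prior: how often the user files things in this folder.
            score = math.log(count / total)
            folder_total = sum(self.word_counts[folder].values())
            # Add-one smoothing, with one extra slot for unseen words.
            denominator = folder_total + len(self.vocabulary) + 1
            for word in words:
                score += math.log((self.word_counts[folder][word] + 1) / denominator)
            if score > best_score:
                best_folder, best_score = folder, score
        return best_folder


guesser = FolderGuesser()
guesser.learn("University", "java course lecture notes assignment")
guesser.learn("Recipes", "pizza dough recipe oven")
print(guesser.guess("java lecture slides"))
```

    Because it only learns from where the user actually filed things, it sides with the University folder here rather than imposing a developer's idea of where Java belongs.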

  9. Recognize and Navigate Multi-Page Displays by MonkeyBoyo · · Score: 5, Interesting

    Often masses of information are broken into multi page presentations.

    Somewhere on the page you have buttons named things like Next, Previous, or Page: 1 2 3 4 5 6.

    There may be good design rules for positioning these elements but often they are not followed.

    I've found many instances where I have to scroll up or down just to find the Next button so that I can click it.

    It should be possible to learn for a given site (or sub-tree of a site) what the Next and Previous buttons are just from user behavior and the nearly identical layout of say page 2 to page 3. I think this could be done without parsing any of the html or gifs associated with the buttons.

    If Firefox could learn and extract multi-page navigation then these functions could be bound to buttons up on the menu bar, or assigned to keys, and the whole problem of scrolling to find a Next would go away.
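
    A much simpler stand-in for the learning described above, shown only as a sketch: if the last two URLs the user visited differ in exactly one number and that number went up by one, guess that the next page increments it again. The function name and the example.com URLs are hypothetical, and this looks only at URLs, not at layout or user clicks.

```python
import re


def predict_next_url(previous_url, current_url):
    """Guess the next page's URL from the last two URLs visited.

    Returns None unless the two URLs differ in exactly one number
    and that number increased by one.
    """
    # Splitting on a captured group keeps the digit runs in the result.
    prev_parts = re.split(r"(\d+)", previous_url)
    curr_parts = re.split(r"(\d+)", current_url)
    if len(prev_parts) != len(curr_parts):
        return None
    differing = [i for i, (p, c) in enumerate(zip(prev_parts, curr_parts))
                 if p != c]
    if len(differing) != 1:
        return None
    i = differing[0]
    prev_num, curr_num = prev_parts[i], curr_parts[i]
    if not (prev_num.isdigit() and curr_num.isdigit()):
        return None
    if int(curr_num) != int(prev_num) + 1:
        return None
    # Note: zero padding such as "02" is not preserved in this sketch.
    curr_parts[i] = str(int(curr_num) + 1)
    return "".join(curr_parts)


print(predict_next_url("http://example.com/story?page=1",
                       "http://example.com/story?page=2"))
```

    A Next key binding could use this guess when it is available and fall back to doing nothing otherwise.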

  10. flash preference detection by shaitand · · Score: 5, Interesting

    There are three types of sites in the world:

    Those that use flash for ads
    Those that use flash for content
    Those that stay the hell away from flash

    Right now, Firefox doesn't have any way to tell the difference between 1 and 2. But I do: I can clearly see whether it's an ad or not. On every flash ad, give me the option to tell the browser whether it's good flash or bad flash, and have it intelligently learn which sites have bad flash and block flash content on those sites, instead presenting me with a button to load that content. ("Sites" should also be defined by study of the urls: if I say www.bob.com/~jimbo/whatever.htm and www.john.com/~jimbo/howie.htm and www.curly.com/~jimbo/marthastewart.html are bad, it should figure out there is a commonality in the ~jimbo part and apply my preference.)

    It should use a number of pieces of information: the url of the page, the url of the flash animation, the size of the animation, the name of the animation, the server the page is being served off of, etc.
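
    A minimal sketch of the url part of this, assuming each good/bad answer casts a vote on the host name and on every path segment of the page url, so that a segment shared by several bad pages ends up with a high score. The FlashPreferences class and www.moe.com are hypothetical; animation size and name could be added as further features in the same way.

```python
from collections import defaultdict
from urllib.parse import urlparse


def url_features(url):
    """Split a page url into host and path-segment features."""
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to find the host
    parsed = urlparse(url)
    features = {"host:" + parsed.netloc}
    for segment in parsed.path.split("/"):
        if segment:
            features.add("path:" + segment)
    return features


class FlashPreferences:
    """Hypothetical store of the user's good flash / bad flash answers."""

    def __init__(self):
        self.votes = defaultdict(int)  # feature -> bad votes minus good votes

    def mark(self, page_url, is_bad):
        """Record one answer from the user for flash on this page."""
        for feature in url_features(page_url):
            self.votes[feature] += 1 if is_bad else -1

    def should_block(self, page_url):
        """Block when the page's features lean towards bad flash."""
        score = sum(self.votes.get(f, 0) for f in url_features(page_url))
        return score > 0


prefs = FlashPreferences()
prefs.mark("www.bob.com/~jimbo/whatever.htm", is_bad=True)
prefs.mark("www.john.com/~jimbo/howie.htm", is_bad=True)
prefs.mark("www.curly.com/~jimbo/marthastewart.html", is_bad=True)

print(prefs.should_block("www.moe.com/~jimbo/other.htm"))
print(prefs.should_block("www.moe.com/index.htm"))
```

    The first check blocks because of the shared ~jimbo segment, even though www.moe.com has never been seen; the second does not.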

  11. Re:A few ideas by Artifakt · · Score: 5, Interesting

    "1. Keep track of how users enlarge/reduce the font size: if sites that use a 10 point font are repeatedly enlarged to 14 or 16 point then it is fairly safe to assume that the user has poor eyesight and all sites with tiny text should automatically be sized up."

    This is a good concept in several ways.
    First, what most people with eyesight limitations do is adjust the really severe problem text and put up with the less severe sorts, so if they enlarge 10 point to 16 consistently, they enlarge 12 to 16 only late in a browsing session, and just put up with 14 point type even though it's a bit smaller than optimum for them. People will go to an effort only when the threshold of discomfort is crossed and the problem gets their conscious attention, and many people will put up with a problem beyond that.
    Second, it's a clearly quantifiable area, making it the sort of thing machines can excel at. If it turns out to have unexpected complexities, we will get a warning about how much worse other tasks, such as adjusting web sites based on the user's color preference or aesthetic criteria, will be (no plaid backgrounds).
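
    A minimal sketch of how that could be quantified, assuming the browser logs each manual change as an (original size, chosen size) pair in points. It takes the median size the user enlarges to and applies it to any smaller text, including sizes they currently just put up with. The function names and the three-observation threshold are assumptions.

```python
from statistics import median

MIN_OBSERVATIONS = 3  # assumed: don't act on one or two adjustments


def preferred_size(adjustments):
    """Median size (pt) the user enlarges text to, or None if too little data.

    `adjustments` is a list of (original_pt, chosen_pt) pairs.
    """
    enlarged_to = [chosen for original, chosen in adjustments
                   if chosen > original]
    if len(enlarged_to) < MIN_OBSERVATIONS:
        return None
    return median(enlarged_to)


def adjusted_size(page_pt, adjustments):
    """Size to render a page at, given its own size and the user's history."""
    target = preferred_size(adjustments)
    if target is None or page_pt >= target:
        return page_pt
    return target


history = [(10, 16), (10, 16), (12, 16)]
print(adjusted_size(14, history))  # tolerated but below the learned target
print(adjusted_size(18, history))  # already large enough, left alone
```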

    --
    Who is John Cabal?