Boxxet, a Tool for Automatic Webpage Generation
tkajstura writes "New Scientist is reporting on 'a new tool [called Boxxet that] offers to create websites on any subject, allowing web surfers to sit back, relax and watch a virtual space automatically fill up with relevant news stories, blog posts, maps and photos.' It uses an algorithm based on unique word count to filter an index and integrate relevant subject information into the page, called a 'Boxxet.' The tool will first be available by invitation only, opening to the general public by the end of April 2006."
Now that we are finally rid of geocities pages some new shit service comes along.
So you say you wanna be a blogger, but you're just too darn lazy? No problem!
Paleotechnologist and connoisseur of pretty shiny things.
explains slashdot articles!
Hope they got that dupe bug fixed
__
Sigs are like arse-holes, everybody has one
So.... I'm a cool person, how do I go about getting invited? Anyone on /. have enough sway to hook a geeker up?
023AD01("Child", "Evil");
I'd be interested to hear from users how well this thing works. Is it powerful enough to be useful? If so, cool!
Any experiences here?
"The tool will first be available by invitation only, opening to the general public by the end of April 2006."
Subscribers can see the random crap early!
Does anyone have an example page that is a result of this algorithm? The article is a little sparse on details or functionality, and you can't see anything if you go to the website.
From what I've read, I've tried to come up with stuff that I'd put in the first 5 links to give to the site, and I'm having trouble. I don't necessarily like to view the same things or same types of things from day to day, so I'm not sure how useful that'd be...
I can just see this program being used to "create" content to push more advertising. Just what we need more of, websites that have recycled content put online for ad revenue.
Accentuate the positive, don't waste your mod points on the negative.
Isn't that what a search engine does? You type in a phrase and it finds things like that and sends you a web page?
...I only read the best webpages generated by algorithms which suggest what I might find interesting...
Judges and senates have been bought for gold; Esteem and love were never to be sold.
and a nightmare for search engines. Hopefully there will be a way to detect boxxet pages and purge them, or at least show them separately from relevant content. Going from a search result link to another link full of partial information will be frustrating for many users and only benefits those who are making a living off of Google ads, affiliates, etc.
'mmmmmmmmm.... forbidden donut'
How long until someone (i.e. everyone) figures out how to fool the algorithm and exploit the system so that their blog posts show up every single day on the front page of the "Boxxet"? Unique word count has got to be the most naive algorithm out there. Remember in the nineties when every web page had a list of three thousand keywords at the very bottom of the page to fool the search engines of the time?
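To make the worry concrete, here is a rough sketch of how a scorer built on unique word counts could be gamed. Nobody outside Boxxet has seen the real algorithm, so everything here (the `relevance` function, the topic list, the sample pages) is a made-up assumption for illustration only:

```python
import re


def unique_words(text):
    """Return the set of distinct lowercase words found in text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def relevance(topic_words, page_text):
    """Hypothetical score: fraction of the topic's distinct words present on the page.

    This is an assumed stand-in, not Boxxet's actual method.
    """
    topic = {w.lower() for w in topic_words}
    if not topic:
        return 0.0
    return len(topic & unique_words(page_text)) / len(topic)


topic = ["linux", "kernel", "driver", "patch"]

# An honest post that happens to mention three of the four topic words.
honest = "A new kernel patch fixes a driver bug."

# A spam post with the topic words pasted at the bottom, nineties style.
stuffed = "Buy cheap pills here! linux kernel driver patch linux kernel driver patch"

honest_score = relevance(topic, honest)
stuffed_score = relevance(topic, stuffed)

print(f"honest:  {honest_score:.2f}")
print(f"stuffed: {stuffed_score:.2f}")
```

The stuffed page outscores the honest one because the scorer only checks which words are present, not whether the surrounding text has anything to do with them.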
There are 2 kinds of people in this world. Those that can keep their train of thought,
KBBL Boss: This is the DJ 3000. It plays CDs automatically, and it has three distinct varieties of inane chatter.
[presses a button]
DJ 3000: Hey, hey. How about that weather out there?
Woah! _That_ was the caller from hell.
Well, hot dog! We have a weiner.
Bill: Man, that thing's great!
Marty: _Don't_ praise the machine!
KBBL Boss: If you don't get that kid an elephant by tomorrow, the DJ 3000 gets your job.
[Marty punches it]
DJ 3000: Those clowns in congress did it again. What a bunch of clowns.
Bill: [laughs] How does it keep up with the news like that?
Boxxet will create the page for me, and then /. will read it for me. I don't need to get online at all!
Web 2.0 == Giant Blogspam Circle Jerk
This kind of tool might be nice for those people that are too lazy to either blog themselves or do some honest-to-god surfing, but can you really see publishers being thrilled that their content is going to be diluted and published on some Joe Q Random's Boxxet page?
Now, some bloggers and others might be happy to be republished verbatim outwith their control. That's fine. But most professional webmasters have a name for bots that go around taking content and putting it on other sites without permission*. They are called scrapers. The Boxxet bot and others like it are and will be banned by many webmasters (including myself) because the potential for abuse is too high.
There is also a name for such sites automatically produced by scrapers -- made for AdSense
* Note: There is no problem with sites that take headlines, write a summary/teaser and link back (like a certain site we are all very familiar with). These sites are doing a Good Thing(TM) for the content creators -- sending them an interested [ie targeted] audience. The problem for both the publishers and the search engines is the scraping. Only time will tell whether Boxxet is one of the troublemakers (cause the article and the site sure don't give many clues).
If all you have is a grenade, pretty soon every problem looks like a foxhole -- MightyYar
As the volume of recycled content goes up, the noise ratio will eventually be too much for anything to put up with. That's why I'm working on an automated web surfer so that the recycled content can find some readership.
Agreed. We don't need more **junk** pages cluttering search results, and confusing my father-in-law. Stop the insanity!
PLEASE - no more of this crap!
I only hope that they took into consideration hackers trying to break into websites. I've been getting lately:
Drupal: Someone trying to see if I am running Drupal.
Mambo: Someone trying to see if I am running Mambo.
phpmyadmin: Same as above.
xmlrpc.php: Used (or it used to be used) by both Drupal and Mambo.
index.php and index2.php: Used by both Drupal and Mambo.
cmd.gif: Four different sites configured to help hackers deface your site.
and lots of others. So my input would be to run a test site anonymously as Boxxet and see if the hackers can breach the site before releasing it for people to use. Otherwise - it looks like it might be a nice kind of program to use.
PS to whoever is running Slashdot: The "Sections" area is doing some strange things and gave me an error once about SectionPrefs(???).
Someone put a black hole in my pocket and now I'm broke.
It's the same bull that you get when you type a domain name into your browser to see if it's taken and find a cybersquatted site with search engine material on it, put there to make the page appear to have some original content.
I also see this sort of thing every time I do a search on a search engine like Google or Yahoo. I will get a result with the descriptor blurb appearing to have info that I am looking for. When I click on the link, I get sent to some cybersquatted 3rd party search results page that is full of ads that have my search term (which the ads usually aren't relevant to) highlighted in their descriptions.
DEAD DEAD DEAD DELETE ME
Now we'll have thousands of phony "news sources" like that, all linking to each other.
So now each search engine will have to develop an automated tool to find and ignore this dreck.
PORN porn porn XXX xxx xxx TITTIES titties tittes NAKED naked naked SLUTS sluts sluts
Wow - this word count filter rocks!
when you see the word 'Linux', drink!
Do you see what I did there?
Taking something like
news.google.com -> Personalize -> Save Page as...
Except automated?
I guess sometimes the simple ideas are the best ones.
Except when they're just dumb.
It would be interesting to see how much information Boxxet pulls off other sites and how it represents this as useful information without broaching copyrights.
The algorithm sounds like Dissociated Press to me.
Do you like this comment generated by my automatic slashdot comment generator? Do you like this comment? Viagra only 4.00 from here. Do you like this automatically generated comment. It can be filled up with any kind of content, hairy lobsters, automatic content. Do you like my automatic content generation. Bugs to smooth out. Beta version. Automatic slashdot comment generation. Only 4.00 with viagra. Well do you? Please come again.
"The White House is not an intelligence-gathering agency," -- Scott McClellan, Whitehouse spokesman.
Are you talking about Boxxet or Slashdot?
websites reliant on user-generated content such as blogs and collaborative bookmarking sites are facing a new problem: how to sustain a continuous flow of new information as the small pool of people that actually bother to post information gets spread increasingly thinly (emphasis mine)
He's got to be kidding me! The number of people that bother to post is shrinking?
Gathering content for you to peruse based on a text string... isn't that what Google does? Better? Sheesh, and with Google I'm stuck searching for the same crap day in and day out. Every day can be completely new crap.
- I voted for Nintendo and against Bush
Sorry, but I just don't get the usefulness of this. I see the cool factor, but not how someone would put it to good use. Can someone suggest examples?
"Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
Very offtopic I know, but why does slashdot still use Y99 dates?
4 3
http://slashdot.org/article.pl?sid=06/03/08/19482
06/03/08 instead of 2006/03/08
No, using 2006 in the url does not work.
Are they saving 2 characters in the database or in the url?
Perhaps the assumption is that everything here (especially this comment) is worthless in the year 2100?
It seems odd that a programmers' news site uses Y99 dates.
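For anyone wondering what actually goes wrong with a two-digit year, here is a small sketch in Python. It only shows how the standard library guesses the century for a stamp like the `06/03/08` in the URL above; `parse_short` is a name invented for this example, and nothing here says how Slashdot really stores its dates:

```python
from datetime import datetime


def parse_short(stamp):
    """Parse a yy/mm/dd stamp the way Python's strptime does.

    Python follows the POSIX convention for %y: 69-99 become 1969-1999
    and 00-68 become 2000-2068. The century is a guess, not data.
    """
    return datetime.strptime(stamp, "%y/%m/%d").date()


this_story = parse_short("06/03/08")   # read as 2006
old_story = parse_short("99/03/08")    # read as 1999
pivot_low = parse_short("68/01/01")    # last year treated as 20xx
pivot_high = parse_short("69/01/01")   # first year treated as 19xx

for d in (this_story, old_story, pivot_low, pivot_high):
    print(d.isoformat())
```

A story posted on 2106-03-08 would produce the same `06/03/08` stamp as this one, which is the collision being joked about.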
Sorry I haven't jumped in earlier but here goes.
The New Scientist article didn't describe it as well as I would have liked. Think about a place like Slashdot, which is a great destination for tech information. We think that there ought to be similar places for many other subjects, whether it is a sports team, school, hobby, etc.
The problem with trying to support many subjects is that most subjects cannot produce a community as active as Slashdot. So Boxxet is trying to use automation to augment the user submissions and preferences.
Who knows, this thing may be totally not useful, but we're going to give it a shot.
We expect to open up invitations starting next week. We did not expect to get on Slashdot so our queue is higher than expected.
We will try not to disappoint.
You Mon Tsang
Now if we can just develop some sort of automated tool that obsessively scans a list of webpages for updates, leaving inane comments when it encounters a new piece of content, we can all finally leave behind the drudgery of the web and enjoy more free time.
When all you have is a hammer, everything looks like a skull.
The /. crowd doesn't like it. Learning from history, that means it will be a HIT with Joe Average and friends. :)
... until people start maintaining blogs based on 'boxxet' news stories....
this should be an interesting infinite loop.
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
I haven't RTFA but do they mention how they do it? Is it just a simple RSS aggregator with a few thousand feeds and then it filters the results? Something like that can be done in a day.
Another blood sucking RSS utility written by me: cribot.com (cut it some slack, this was done in a day or two).
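For the curious, the "aggregate some feeds and filter the results" idea really is only a few lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the sample feed, the `filter_items` helper, and the plain substring match are all assumptions for illustration, not how Boxxet or cribot work:

```python
import xml.etree.ElementTree as ET

# A tiny inline RSS 2.0 document standing in for a real downloaded feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>Linux kernel released</title><link>http://example.com/1</link></item>
  <item><title>Cooking with garlic</title><link>http://example.com/2</link></item>
  <item><title>New Linux driver patch</title><link>http://example.com/3</link></item>
</channel></rss>"""


def filter_items(feed_xml, keyword):
    """Return (title, link) pairs for items whose title contains keyword."""
    root = ET.fromstring(feed_xml)
    needle = keyword.lower()
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Case-insensitive substring match on the title only.
        if needle in title.lower():
            hits.append((title, link))
    return hits


matches = filter_items(SAMPLE_FEED, "linux")
for title, link in matches:
    print(title, link)
```

Run over a few thousand feeds, the same loop gives you a crude topic page; ranking and deduplicating the hits is where the remaining work would go.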
So, I guess the real question is, Is Boxxet based on a good search engine? If not, I can see Grandma setting one up to gather topics related to caning and getting entries like Naughty Linda likes to have her big bottom turned red with a hairbrush. Do you want to help? If that doesn't induce a heart attack I'll eat a bug.
When will boxxet finally put Zonk out of a job? Surely /. could get better stories with an advanced computer program.
*tongue in cheek*
"The DJ "personalities" aren't the point; the music is."
Absolutely!
And that's why DJs on non-commercial radio stations are still vital, because we don't have to follow a set playlist, and can generally play whatever we want to.
On my radio show, Music Out of Bounds, I mix together very good sets of new and older music, and I always get positive listener response.
Apparently, most slashdotters are unaware that non-commercial radio stations offer a viable alternative to commercial radio.
So clue up...
Radio isn't dead.
And some DJs are for the greater good.
'nuff said.
????? a tool for automatic welfare generation.
Emacs is a good operating system, but it has one flaw: Its text editor could be better.
Now you can have a web site without doing anything... I guess you can feel very proud of doing nothing on your web site...
I can't imagine any wrong use for this technology...
It's never too late to stop doing something wrong, or to start doing something right.
Something created by a M$ employee. Farhan Ahmed: Is cribot open source? If not, better not to promote it on /.
People who go around breaching copyrights are not well known for broaching the problem of copyright infringement.