Although it lets you choose what to index and what to skip, the indexer starts as soon as you install the software, so you never get the chance to exclude files and directories before they are indexed.
I'm familiar with the Twisted Framework (TF), and was about to ask: does this book cover TF? TF is quite nice, but if I remember correctly, it's quite a big piece of software, and has a steep learning curve. Does this book go over TF or just 'raw' network programming with Py?
Neither Furl nor Delicious nor Spurl really full-text index your bookmarks, and I find that to be a MAJOR minus in their service. Simpy, on the other hand, crawls and re-crawls pages you bookmarked, which lets you make full-text searches against your own and other users' indices, a la Google.
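For anyone wondering what "full-text indexing your bookmarks" boils down to, here is a minimal sketch in Python. It assumes the page text has already been fetched, and the `BookmarkIndex` class and the example.com URLs are made up for illustration. This is not how Simpy or any other service actually works, just the basic inverted-index idea.

```python
import re
from collections import defaultdict


class BookmarkIndex:
    """Toy inverted index: word -> set of bookmark URLs containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, url, page_text):
        """Index the (already fetched) text of one bookmarked page."""
        for word in self._tokens(page_text):
            self._postings[word].add(url)

    def search(self, query):
        """Return URLs whose text contains every word in the query."""
        words = self._tokens(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = BookmarkIndex()
index.add("http://example.com/lucene",
          "Lucene is a full-text search library written in Java")
index.add("http://example.com/twisted",
          "Twisted is a networking framework written in Python")

print(index.search("written in java"))
```

Only the first bookmark contains all three query words, so it is the only hit. A real service would also have to drop stale words when a page is re-crawled and rank the results, which this sketch leaves out.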
Java C# porting - Lucene as example, on "Java 1.5 vs C#" (Score: 4, Interesting)
It is this similarity and 'compatibility' of Java & C# that is now making it easy to port various applications between the two languages. For instance, the very popular Lucene (Information Retrieval library from Jakarta (i.e. Java)) has a very solid .Net port written in C# called dotLucene. The Lucene -> dotLucene port is fairly automated, it appears, which allows developers of the .Net/C# port to keep up with the original software written in Java.
If C#/Java continue in this direction, I think we will see many more applications that have parallel versions in the two languages.
Yes, except they never did anything with Mozilla. I never understood what they were smoking. Now that I see they are even releasing an IE-based browser I know for sure I don't want any of that thing they are smoking. Baaaad.
Firefox & Co. are coming back, and that software is indeed technically superior to IE. However, the Mozilla Foundation is still missing one crucial piece of the puzzle: a distribution channel. Until somebody with a big distribution channel jumps in and helps Mozilla, my web server access log will continue showing Mozilla user base growth of less than 1%/month/year.
That is where GBrowser comes into play. Google has a massive distribution channel that knows no OS boundaries.
This is only somewhat true. While I have been reading all Word documents with OpenOffice (OO) for the past 2 years or so, I often run into Word features not supported by OO. For instance, I recently received a password-protected Word document that I could not open with OO. I had to use AbiWord (how come the report doesn't mention that!?). Another missing feature seems to be the ability to view Word document changes when the original document has 'track changes' turned on.
I guess reports like this one help larger, less up-to-speed corporate users by opening their eyes and mind.
Let's not forget that Linux didn't have the volume, either. Google didn't have it either. Rarely does anything have volume when it's young. Quantity (volume) is not the only factor. There is something to be said about quality, too! :)
Have you seen Otac na sluzbenom putu (When Father was Away on Business)? There is a child character called Malik in this film, and he likes to walk around Sarajevo in his sleep. After his mom and his slightly older brother found out about his sleep-walking habit, they tied a rope around his big toe, and put a little bell on the other end of the rope. Looked like a good monitoring solution to me!
Do not underestimate the power of human greed. Think about how many of them will consider this to be just their _first_ million. They are not going anywhere any time soon.
Plus, $1M - ~40% tax = $600,000.
Plus, $1M is really not all that much nowadays - if you have 2 kids in the U.S. and send them to a private college that will set you back at least 2 x 4 x $35,000.
I think search IS the killer feature. However, just trying to be _better_ than Google is not going to do it. The company/service that will succeed in being better than Google will also have to come up with some alternative ideas and approaches to solving the same problem (finding a needle in a haystack). One of the approaches is to create a service based on humans instead of crawlers - and before you say it - I am not talking Yahoo or DMOZ-like directories and such. I'm talking what some people are calling 'folksonomies'. Instead of describing and explaining, I'll just point you to a demo account of one such service. Of course, this is just one of the alternative approaches, and this alone will not beat Google (at best, it will intrigue Google or Google-like companies to either copy, steal, or acquire). But, it's a start of thinking in a somewhat new direction, in my opinion.
It is interesting to note that there is/was another CMU project that carries a similar name: WebSPHINX (http://www.cs.cmu.edu/~rcm/websphinx/). It is also written in Java, but is not related to speech recognition - it's a small web crawler. Does anyone know why CMU projects like to use SPHINX for their names?
I agree with this! I hate managing my links (aka bookmarks aka favorites aka the stuff in your browser that you can never find when you need something and never know which folder to file it under). However, I like having my own personal search engine that lets me full-text search the contents of bookmarked web pages. That's what I use Simpy for (see sig). Smaller index, my stuff, my 'neighbourhood', etc. Try the demo and see for yourself (but be kind to this shared account). Long live search, death to hierarchies!
Grab it while it's hot. Going fast!
Simpy: Simpy.
Demo: demo account (shared, be nice)
See:
Lucene
dotLucene
Now we know what kind of teeth Jaws will wear in the next James Bond flick - carbon nanotube dentures.
unit tests
not a panacea, but it does go far.
http://webcrawler.com
But you have to give it to Microsoft, too. They are very successful at continuous pushing of large volumes of DOS consoles with a new UI.
To the person asking the original question: I would try going to people who deal with maps:
- MapQuest + all other companies that provide mapping services online
- National Geographic Society
- Maybe GPS companies (e.g. Garmin) can help
Maybe you can even contact publishers of atlases; they may have some hi-res maps that would help this child.
Yeah, and they are also building a clinic with a large emergency room nearby.
Are they open-sourcing the $50 bill? Can we fork it?
Some may find it interesting that Wikipedia (covered earlier today on Slashdot) uses some code that came out of LiveJournal for caching: memcached.
Add to that this graph, which shows that The Butler & Co. are not going anywhere, while Google looks like it ate a ton of Duracell batteries.
What, you tasked their butler?
gmsn.com?