The Mini allows up to 100,000 documents/URLs to be stored in a collection, and AnandTech contains approximately 40,000 articles, news items, and blog entries.
When we first set up the Mini, we told it to start in each of the website's sections (for example, http://www.anandtech.com/it/) and in the web news area. The Mini considers any unique URL string to be a unique document, which makes sense (but is a bit surprising the first time that you run an index).
After four hours of indexing, the Mini had managed to reach its document limit and we had to improvise... A word to the wise: don't let the Mini crawl your entire site without keeping a close eye on it.
In other words, spidering the entire site led to the Mini wasting space on stuff other than the ~40k articles they really wanted indexed and running into its 100k limit.
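To make the "unique URL string = unique document" point concrete, here is a rough Python sketch of how a handful of URL variants for one page inflate a naive document count, and how canonicalizing them first collapses the count. Everything except the /it/ URL quoted above is made up for illustration: the query parameter names, the `NOISE_PARAMS` set, and the `canonical` helper are assumptions, and this says nothing about how the Mini itself is configured.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that change the URL string without changing the page.
NOISE_PARAMS = {"sessionid", "sort", "print"}

def canonical(url):
    """Collapse URL variants that point at the same page into one string."""
    parts = urlsplit(url)
    # Keep only parameters that select content, in a stable order.
    query = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k.lower() not in NOISE_PARAMS
    )
    path = parts.path or "/"
    # Lowercase scheme and host, drop the fragment entirely.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )

# Only the first URL comes from the article; the variants are invented.
crawled = [
    "http://www.anandtech.com/it/",
    "http://www.anandtech.com/it/?sessionid=abc123",
    "http://www.anandtech.com/it/?sort=date",
    "http://WWW.anandtech.com/it/#comments",
]

naive_count = len(set(crawled))                       # what a string-based crawler sees
canonical_count = len({canonical(u) for u in crawled})  # distinct pages after cleanup
print(naive_count, canonical_count)
```

Four URL strings, one actual page. Scale that up across a whole site and it is easy to see how ~40k articles can eat a 100k document limit.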
your problem is similar to DMCA proponents (on "Defeating Captcha" · Score: 1)
you say, "textual tests would do just as well."
DMCA advocates say, "content protection can stop bad guys without inconveniencing good guys."
Both are flat-out wrong in the real world.
The good news is, you have a potentially bright career ahead of you in politics.
Python gives you the power and expressiveness of perl, but with actual design principles behind it.
Perl vs Python is sort of like MySQL vs PostgreSQL: each of the former was once useful, but now there's an alternative that is so much better, why handicap yourself?
/one of many former perl users since moved to python
Wow. Isn't Monday morning a bit early to be hitting the crack pipe that hard?
Sample "in-depth" response for those who didn't RTFA:
How does eBay weed out unscrupulous sellers on your site?
MacGibbon: We have zero tolerance for wrongdoing and are committed to making eBay as safe as possible for our members. We also work closely with law enforcement agencies to help them to bring offenders to justice.
If you're a typical corporate CTO, you know that you will have no shortage of either Java or .NET developers for the next 10 years or so, which means those are going to be the most important candidates for just about any project you're considering.
Nobody writes Java with Vim or even Emacs JDE for very long. The productivity increase a real IDE like Eclipse provides is phenomenal. System.out.println is 6 or so characters of typing with code completion. (But the real win is being able to do things like say "show me all the places my constructor is invoked." Text searching isn't good enough when you have a million-line project.)
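For anyone who doubts the "text searching isn't good enough" part, here is a small sketch of the difference. It uses Python's standard `ast` module rather than Java and Eclipse, purely because it fits in a few lines; the `Parser` class, the `SOURCE` string, and the `constructor_calls` helper are all invented for the example, and a real IDE does far more (types, overloads, cross-file references).

```python
import ast

# Hypothetical source file, invented for illustration.
SOURCE = '''
class Parser:
    def __init__(self, text):
        self.text = text

def build():
    return Parser("a")      # real constructor call

def describe():
    return "Parser(x) is documented elsewhere"  # text match, not a call

p = Parser("b")             # real constructor call
'''

def constructor_calls(source, class_name):
    """Return the line numbers where class_name is actually called."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == class_name
    )

# Plain text search: every line containing the string "Parser(".
text_hits = [n for n, line in enumerate(SOURCE.splitlines(), 1) if "Parser(" in line]
# Syntax-aware search: only genuine call expressions.
call_hits = constructor_calls(SOURCE, "Parser")
print(text_hits, call_hits)
```

The text search also flags the string literal in `describe()`; the syntax-aware search does not. On a million-line project those false hits add up fast.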
Maybe it's because having to save all your work, rebooting, rebooting again when your game is done, and restoring all your applications to the right state is a HUGE WASTE OF TIME.
Right now, for instance, I have 12 applications open, only a few of which have entirely satisfactory auto-restore-after-shutdown functionality.
paragraphs.
:)
paragraphs are good.
(capitalization is optional, however.)
If you're "running like 20 wiki and 5 custom web apps and a few WordPress installations" on your server then you shouldn't be intimidated by the 2 or 3 lines it takes to forward requests to the CherryPy server.
Get a grip.
find lcc/src -type f -print0 | xargs -0 dos2unix
that's all you need
but just about everyone refers to that book as, well, The Gang of Four.
it's like the Dragon Book, but that's probably before your time.
sorry to burst your bubble...
"just like the PS2 HDD"
nobody developed for it because nobody had it, and nobody bought it because nobody developed for it.
it's possible that 360+HDD will be worth developing for if MS can get enough people to buy it early enough... good luck.
so, for someone with no BT experience --
care to give idiotproof instructions to seed, say, debian?
no, it's not by aol (nee naviserver; renamed when aol bought it)
...and obviously it scales like crazy; aol eats their dogfood
it's quite slick and doesn't have many if any of the problems listed in TFA
What would that be?
those firefox bastards, always ripping off opera...
on a more serious note, didn't omniweb have tabbed browsing blah blah blah before Opera?
Jython's competition is Groovy. Groovy doesn't even have a stable DESIGN, let alone implementation.
How long were /. types calling OpenSolaris vaporware? :P You of all people should know that lack of a public release does not equate to lack of progress.
At least with Jython you can check the cvs log and see that commits are indeed happening.
Jython has been stable for years now, and is a much better-designed language than Groovy appears likely to become. Where's the love?
Oh, man.
Comparing Trails to a real lightweight kit says "I'm a fanboy and I don't know what I'm talking about."
Bye.
is only light relative to even heavier Java solutions. :-|
Invariably people who sing the JSP praises have no significant experience with a real lightweight toolkit (Spyce, CherryPy, RoR, ...)
But that's okay, because doing things the hard way builds testosterone.
Which is sad, because as much as PHP sucks, J2EE solutions suck just as badly in different ways. (That's another article.)
Because so much of the console market is just dying to spend an extra $500 for a PPE, after the $500 for a good PC GPU.
As a PS2 and XBox owner (but not a PC gaming rig), let me say: no, thanks.
I've also gotten the "sorry, the scammer's account has no funds, so you'll just have to suck it up" line.