Slashdot Mirror


User: treerex

treerex's activity in the archive.

Stories
0
Comments
138

Comments · 138

  1. No, Google shouldn't be worried on Microsoft Buys Search Engine, Going After Google? · · Score: 2, Informative

    Google isn't in the enterprise search space. Yes, they have the appliance, but that doesn't count. What FAST offers is a good product coupled with the professional services organization to integrate it into a business's workflow. The companies that Microsoft is now going head-to-head with are Endeca, Autonomy, Vivisimo, and their ilk.

  2. Re:Oh please on Microsoft [to patent] Verb Conjugation · · Score: 2, Interesting

    The issue I have with this patent application is that it doesn't even present a novel method for generating conjugation tables for a given verb form. The entire patent comes down to: look up entered words in a table. From that table, link to this or that table, from there link to that or this table, ad nauseam. Everything is precomputed. They are patenting the brute-force, high-school freshman BASIC assignment version of this problem. Oh, it mentions possible UIs to display the disambiguation data. Big whoop.

  3. Re:Don't delude yourselves on Google to Continue Storing Search Requests · · Score: 1

    They address this a bit in the readme that went out with the data: sanitizing the data corrupts the data, from a research perspective. And it is really difficult to do this adequately. Sure, you can scrub it with a regexp for SSNs or phone numbers, but names? Using what name list? And what if the names being searched for aren't the name of the searcher, but someone else? That is valuable information. The point is you cannot do this easily, and never adequately.

    I suppose they could have released a subset of the data with some scrubbing, but someone would still complain. They could extract the query terms, rank them, and only return those that appear more than some threshold... that is useful information to a degree.
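
    A minimal sketch of the two ideas above: regexp scrubbing, then keeping only terms that clear a frequency threshold. The patterns (US-style SSNs and phone numbers only) and the function names are assumptions for illustration, not anything taken from the released data or its readme.

```python
import re
from collections import Counter

# Hypothetical patterns: US-style SSNs and phone numbers only.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scrub(query):
    """Replace anything that looks like an SSN or phone number with a tag."""
    query = SSN_RE.sub("<SSN>", query)
    return PHONE_RE.sub("<PHONE>", query)


def frequent_terms(queries, min_count):
    """Return (term, count) pairs seen at least min_count times, most common first."""
    counts = Counter()
    for q in queries:
        counts.update(scrub(q.lower()).split())
    return [(term, n) for term, n in counts.most_common() if n >= min_count]
```

    Note that a query like "john smith boston" passes through scrub() untouched, which is exactly the problem with names: no regexp will catch them. The threshold filter drops such rare queries, but it also throws away most of the long tail a researcher would want.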

  4. Don't delude yourselves on Google to Continue Storing Search Requests · · Score: 4, Interesting

    Every search engine logs your queries. This is the way it is. If they tell you they don't log the queries, they're lying. The difference is that they don't make it available. In a previous life I worked with several search companies you've heard of on various search-related technologies, and they *never* released query logs. Even cleansed, the data were kept close to the chest. Queries are going to be logged with the IP address of the user. Some engines will track click-throughs on the results as well. That data is invaluable to a search engine.

    AOL's faux pas here was attaching personal information to the queries themselves: once that per-user identifier was attached all bets were off.

    If you are interested in working with query data, and do not work for a search company, you are shit out of luck, because you can't otherwise get this data. All of the research published on queries was done by Alta Vista, Google, Yahoo, Lycos, MSN... research on spelling correction of search queries is done by the same groups: they're the only ones with access to that data, until this AOL release (or older releases from other companies.)

    Having this data is a boon for researchers, but a net loss for people.

  5. Not a surprise, but remember... on No Virtual PC for Intel-based Macs · · Score: 1

    ... Microsoft acquired VirtualPC from a third party (Connectix? I can't remember.) They also have an Intel virtualization product which could be used as a foundation for a Mac OS X Intel version. The statement that moving the Mac version to Intel would be a rewrite is undoubtedly true, but Microsoft could probably enter the market if they wanted. The issue is undoubtedly one of competition and egos. And between Parallels and Boot Camp, an offering from MS here isn't necessary.

  6. Re:Matter of scale on The Business Model of Ubuntu · · Score: 1

    In my experience the RHEL support is not as speedy as that found in other FOSS projects, or in other Linux distros.

  7. Matter of scale on The Business Model of Ubuntu · · Score: 5, Insightful

    It seems to me that the reason Ubuntu (and other OS projects) can respond to user feedback and bug reports more quickly than larger (non-FOSS) companies is the relative sizes of the user communities. Compare the size of the Ubuntu install base to that of Windows (or Mac OS X, or...) and it becomes a no-brainer that you can respond more quickly. Don't get me wrong, I applaud the work the Ubuntu group does, but the ability to respond quickly will lessen as they grow. Compare with Red Hat and its enterprise offerings.

    Just my US$0.02 worth.

  8. Flight sim? on Google Launches Online Spreadsheet System · · Score: 4, Funny

    So I wonder if Google inserted the obligatory flight simulator easter egg?

  9. Re:Why? on Ars Technica Reviews Intel iMacs · · Score: 1

    I would expect that Apple is putting pressure on Microsoft to get the next release of VirtualPC working on the MacIntel machines so that you get performance similar to that of the existing Windows version when running X86 code. In other words, instead of emulating the processor (as it does now on the PPC) it uses the native CPU as much as possible. Then you can boot XP or Win98 inside Virtual PC.

  10. Re:Real hackers use Python. on Larry Wall on Perl 6 · · Score: 1

    Apologies --- my response was written more in tone to the original parent who dismissed the language simply because of the use of whitespace. My point is that this is not a good reason to dismiss this, or any other, language. Comparing it to FORTRAN is disingenuous because Python does not require indentation to specific levels or specific tab stops. All Python requires is consistency within a given level. The examples of posting Python code to websites as a reason why whitespace in Python is bad don't hold up.

    With regard to my statement of source as not being textual information, I should have been more clear. HTML was designed with support for natural language textual constructs: paragraphs, lists, enumerations, etc. that are displayed in a serif font (by default) and where consecutive whitespace is ignored on purpose. One would not use these constructs to represent a block of code, because whether it is Python, C, C#, Perl, or whatever, the formatting will be completely hosed. Instead you put this in a <pre> element, in which case formatting is preserved. Hence you get the results you want and expect. If someone misuses the markup, then you are SOL.

    The fact of the matter is that Python's "required" indentation is almost always (I am hard pressed to think of an exception) what you would expect to write clear and concise code. I find that Python does things the way I would expect it to: it often just works. The syntax is clean and easily described.
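
    A small illustration of the "consistency within a given level" point. The function names are made up for the example; the three functions differ only in how deep each block is indented, and all three are valid Python.

```python
def two_spaces(n):
  if n > 0:
    return "positive"
  return "non-positive"


def eight_spaces(n):
        if n > 0:
                return "positive"
        return "non-positive"


def mixed_depths(n):
    if n > 0:
          return "positive"   # this block sits six spaces deeper than the if
    return "non-positive"     # back at the function body's four-space level
```

    No fixed columns or tab stops are involved, which is where the FORTRAN comparison breaks down: the only rule is that statements in the same block line up with each other.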

  11. Re:Real hackers use Python. on Larry Wall on Perl 6 · · Score: 1

    Well, for one: it makes communicating about the language difficult, e.g. copying and pasting examples from html, web forums, e-mails, etc.

    If your conclusions about programming language are based primarily on how easily you can cut-n-paste code samples from rich-text email and HTML (outside of <pre> elements) then you need to step back and think about your career, IMHO. And it is easy enough to reformat a code block if you have a decent editor: the Emacs Python mode makes such reformatting quite straightforward. The Python interpreter can't do this for you: indentation implies semantics, and you do not want the interpreter doing this.

    • html *ignores*
    • diff has an option to *ignore*
    • most editors have an option to *convert* (tab to spaces, etc.)

    HTML ignores whitespace because HTML was designed for exchanging textual information, not source code.

    Diff has an option to ignore whitespace, but by default it includes it. Since it is insignificant in many situations, ignoring it can be useful. This is not an argument that whitespace is ipso facto bad.

    Editors have an option to convert spaces to tabs (or vice versa) because in those editors you rarely can tell the difference, and the mixture of tabs and spaces yields very ugly results when looking at plain text.

    Writing off a language for such a silly reason as its use of indentation for block structure is ignorant. Arguments that this screws up copying from your favorite website are specious.

  12. Re:Real hackers use Python. on Larry Wall on Perl 6 · · Score: 1

    Funny you should mention that. After meaning to for years, I finally got around to getting the "Learning Python" book... I made it to page 149 where it says "Python uses the indentation of statements under a header to group the statements in a nested block." I stopped reading and tossed the book on my bookshelf on a shelf full of unused & unloved technical manuals.

    This is such a naive and stupid statement. Avoiding a language because block structure is dealt with via indentation instead of {}s? People repeatedly use this as an argument against Python. "Oh, it's got a great library, excellent documentation, responsive community. Oh, it uses indentation for block structure? Never mind." Just dumb.

    What exactly is the problem?

  13. Re:What "performance issues"? on Pros and Cons of Garbage Collection? · · Score: 2, Insightful

    The reason is that most programmers tend to not realize that the free() operation actually takes up a decent amount of CPU cycles, and when you're freeing a bunch of little things all over the place, the overhead tends to add up.

    This depends entirely on the underlying memory manager. Using pooled allocation or other "zone-based" allocators can obviate the hit of these frees. As with many things, it's a tradeoff between the time spent putting a block back on its free list (naive implementation) and storing appropriate metadata with each allocated block to "deallocate" it in almost constant time. There is nothing magical about GC here.

    With a well-designed garbage collector, however, memory is freed all in one big chunk in a single go, and thereby decreasing that overhead.

    Sure, memory is freed in one chunk, but you forget the time spent finding unreferenced blocks and copying them (I assume, since you imply blocks are coalesced into one big block).

    The myth that garbage collection = poor performance is just that, a myth, and most likely started by people who associate Java's performance issues with garbage collection.

    Because those of us who have used Java since the mid-90s remember when the first JVM's GC sucked like a giant black hole. I remember at OOPSLA '99 Sun had their GC engineers walking around in garbage man's overalls to show that they were serious about improving GC performance in the language.

    GC performance issues have been around a lot longer than Java, by at least three decades.
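
    A rough sketch of the pooled allocation mentioned above, written in Python purely for illustration (a real allocator would live in C or similar and hand out memory, not indices). The Pool class and its method names are hypothetical.

```python
class Pool:
    """Fixed-size block pool: alloc and free are constant-time free-list operations."""

    def __init__(self, block_count):
        self.block_count = block_count
        self.reset()

    def reset(self):
        # Zone-style release: every block is returned in one call,
        # with no per-block free() from the caller.
        # Each free slot stores the index of the next free slot (-1 ends the list).
        self.next_free = [i + 1 for i in range(self.block_count)]
        if self.block_count:
            self.next_free[-1] = -1
        self.head = 0 if self.block_count else -1

    def alloc(self):
        if self.head == -1:
            raise MemoryError("pool exhausted")
        block = self.head
        self.head = self.next_free[block]
        return block

    def free(self, block):
        # Push the block back on the front of the free list: no searching, no coalescing.
        self.next_free[block] = self.head
        self.head = block
```

    Neither alloc() nor free() walks anything, so the cost of "freeing a bunch of little things" is a couple of assignments each, and reset() gives the same "everything at once" release that was credited to the collector.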

  14. Re:Will it cost more than a Dell running Windows? on Intel PowerBook Rumor Mill · · Score: 1

    Point taken. Nevertheless Jobs will want to keep control of the total experience, which he can't do on Dell's crappy laptops. You still end up having to maintain drivers and compatibility across multiple vendors and what not, and it just isn't worth it.

  15. Re:Will it cost more than a Dell running Windows? on Intel PowerBook Rumor Mill · · Score: 2, Informative

    Why does Apple still want to control the hardware? Why don't they just port to Intel and let vendors sell Intel machines with licensed versions of Mac OS? It'll be cheaper.

    Because they then control the drivers and save themselves from the driver compatibility hell that Microsoft has been going through for years. One crappy driver reduces the "experience of Macintosh," and that is not something Jobs would want to do.

  16. Re:More OS X like integration... on What Does Open Source Need for Mainstream Desktop? · · Score: 1

    It's not power I want, but speed. I can configure my Linux system so that I can work quickly and efficiently. On OS X I really can't, and the default configuration has several things that slow me down. Also, just because you can do nearly everything in Linux with OS X, that does not translate into speed or efficiency.

    But you are not someone who is looking for a Desktop Linux, since chances are you would configure it how you want anyway. I'm sure there are many Linux users out there who do not have autoraise turned on: last I looked at Ubuntu it wasn't the default, for example.

    As the original poster said, what he wanted was Unix on the desktop, and as he said, OS X gives you this, and it gives it to you done right (IMHO).

  17. VPN routers, Wikis, and file servers on Software for a Virtual Office? · · Score: 2, Interesting

    I spend a majority of my week working from my home office, driving the 50 miles each way into the company's building only a couple of days a week. I have a VPN router (LinkSys RV042) that extends the corporate network into my house. Our team uses a wiki for tracking issues and such, and shared file servers work fine: the approved or canonical versions of software are put on the server and everyone is expected to stay up-to-date.

    The previous reply to use rsync is a good idea if you want to automatically keep (force) everyone to the same versions of files and such.

    We haven't used anything like GForge, though we do not have a lot of remote development going on (a few engineers cross country, the rest on the same coast.) Adding another email system (for example) on top of whatever the corporate email system provides is a waste and senselessly duplicative. Similarly integrating our RCS into another larger system didn't make sense.

  18. Re:Petabox.... on Building a Massive Single Volume Storage Solution? · · Score: 1

    Does not appear to be a single volume..

    That depends entirely on what software you run on top of the hardware, doesn't it.

  19. Petabox on Building a Massive Single Volume Storage Solution? · · Score: 0, Redundant

    Check out the Internet Archive's Petabox. They have a 100 TB rack running in Europe right now.

  20. Re:Now by "off-the-shelf components" do you mean.. on CIA Investing in Modular Green Energy · · Score: 1

    In-Q-Tel funding is public. The work that they fund isn't classified.

    Off-the-shelf means that, while probably not available at Home Depot or Lowes, components for the system are available on the OEM market and hence the final product does not require customized component engineering, with concomitant cost reductions.

  21. Re:Mixed Reactions on CIA Investing in Modular Green Energy · · Score: 1

    On the other... does it REALLY have to be the CIA?

    To be pedantic, In-Q-Tel is not a governmental agency, and while much of the funding it uses for VC moneys comes from the intelligence community, the CIA does not directly drive where the moneys are spent.

    Anyway, if you live in the US, you have no need to worry. :-)

  22. Cross-reference first: Doxygen is your friend on Reverse Engineering Large Software Projects? · · Score: 4, Informative

    It sounds like you are unable to build the complete system and run it, since you're missing functionality. This removes the possibility of using runtime tracing tools.

    The first thing I would do is run something like Doxygen over it to generate a cross-referenced description of the structures. It won't give you a global view of things, but it will give you a decent browsable view of the code itself. Another response mentioned GNU GLOBAL which may work better for you. Yet another possibility is LXR, though it may not work as well in C++. Regardless, a nice thing about Doxygen is that, when used with GraphViz, you can get useful diagrams generated showing class containment and file inclusion graphs.

    After you have that, get out your paper and pencil, and start drawing and manually tracing things. That's how I go about coming up to speed on new code I can't execute and step through. Eventually transfer that knowledge into a text file (or, nowadays, a wiki) so that others can benefit from it.
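
    A minimal sketch of the Doxygen setup described above, as a Python helper that writes a Doxyfile. The option names are standard Doxygen settings; the project name, the "src" input path, and the helper names are assumptions for illustration.

```python
from pathlib import Path

# Standard Doxygen option names; the values and paths are assumptions.
DOXYFILE_OPTIONS = {
    "PROJECT_NAME": '"legacy-code"',
    "INPUT": "src",
    "RECURSIVE": "YES",
    "EXTRACT_ALL": "YES",          # document entities even without doc comments
    "SOURCE_BROWSER": "YES",       # cross-referenced source listings
    "GENERATE_LATEX": "NO",
    "HAVE_DOT": "YES",             # use GraphViz for the diagrams below
    "CLASS_GRAPH": "YES",
    "COLLABORATION_GRAPH": "YES",
    "INCLUDE_GRAPH": "YES",
    "INCLUDED_BY_GRAPH": "YES",
}


def render_doxyfile(options):
    """Render options in Doxyfile 'KEY = value' form, one per line."""
    return "".join(f"{key} = {value}\n" for key, value in options.items())


def write_doxyfile(directory, options=DOXYFILE_OPTIONS):
    """Write a Doxyfile into the given directory and return its path."""
    path = Path(directory) / "Doxyfile"
    path.write_text(render_doxyfile(options))
    return path
```

    After write_doxyfile("."), running doxygen in that directory should produce browsable HTML with the graphs, provided GraphViz is installed.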

  23. Re:I think not on MySQL Moves to Prime Time · · Score: 1

    MySQL's client libraries are appalling. MyODBC, their ODBC connector, has been one big fuckup after another for the past 2 years.

    The horrible JDBC support with MySQL was the final straw for me, and what pushed me to PostgreSQL for all of my new database backed applications. The PostgreSQL JDBC drivers work really well, especially in the presence of Unicode text data. I've heard that more recent MySQL releases have better handling for this, but it's too late for me.

    Has anyone checked out the GUI admin tools? These are also a long chain of disasters.

    Amen. Whenever I've used MySQL the second thing I install is phpMyAdmin. It just works, and it just works well.

  24. Re:Reasons for using KDE/Gnome on OS X w/Finder on KDE Running on Mac OS X · · Score: 2, Informative

    Don't even get me started on the Finder's utterly, utterly useless "alt-tab" - what a pointless piece of crap. You simply _CANNOT_ switch windows with it, only applications!

    Others have pointed out Cmd-` to cycle windows within an application. There is also a third-party utility called Witch that allows you to switch to any window in any open application. It's what Cmd-Tab wants to be. Strongly recommended.

  25. Re:Don't use flags to indicate language on Multilingual Content Management Systems? · · Score: 4, Insightful

    If, however, the flags represent mere copies of the same site in different languages I think it's less of an issue. Americans, Australians, etc still speak English. French-Canadians, French(wo)man, Nuemeans (spelling??) still speak French...

    The English example isn't a good one: use the US flag and most everyone will know what language you mean, though you still run the risk of alienating other non-American English speakers and the risk of further American imperialism. ;-)

    A bigger issue is Chinese: do you use the PRC flag for this? Congrats, you just seriously annoyed people in Taiwan. Use the Taiwanese flag? Good job, you've just incurred the wrath of the PRC Government. Hong Kong's flag? Confusing: now you're using the flag of a "special administrative region" of the PRC, but one that speaks Cantonese: are you including Cantonese characters in your site's localization (and, by extension, using the HKSCS character set?)

    The answer here is simple: don't use flags as an indicator of language. Instead use the name of the language in that language. Localizing for Finnish? Use "Suomi". Japanese? Use the kanji for nihongo.

    The only time where it is arguably OK to use flags, is when you are using them to represent the country itself: if you have separate sites for the UK and the US, you can use the Union Jack and the Stars and Stripes: iTunes Music Store does this, for example.
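
    A minimal sketch of a language switcher that labels each language with its own name instead of a flag. The language codes chosen, the LANGUAGE_NAMES table, and the language_menu helper are assumptions for illustration; Chinese is split by script rather than by country, which sidesteps the flag question entirely.

```python
# Each language's name written in that language.
LANGUAGE_NAMES = {
    "en": "English",
    "fr": "Français",
    "fi": "Suomi",
    "ja": "日本語",
    "zh-Hans": "简体中文",
    "zh-Hant": "繁體中文",
}


def language_menu(available, current):
    """Return (code, label, is_current) tuples for a language switcher."""
    return [
        (code, LANGUAGE_NAMES.get(code, code), code == current)
        for code in available
    ]
```

    A code missing from the table falls back to the code itself rather than to a guess.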