harrisj · Slashdot Mirror

Re:A first in a new genre? on The Novel as Software · 2004-04-16 03:57 · Score: 5, Interesting

What I was hoping for from the title of the story was something like Galatea 2.2 by Richard Powers. To sum up part of the story there, a professor has a smart AI which drives an interface allowing the user to engage in realistic emails to literary characters. So, the user is able to figure out the story interactively and be part of their own epistolary work (not just read someone else's letters). Obviously, we aren't anywhere near that, and I guess the disappointment leaves me underwhelmed.

It seems like the innovation here is that instead of chapters, the user has days of the week they can click on to look at the formatted messages. And the vaunted interactivity is that the user can read the story out of sequence, not really in a nonlinear fiction sense (that can be hard), but really just in the same way I can skip forwards and backwards in a book if I want. Wow. I agree that the interface is cute, I suppose, but the style really is more like a "game" version of a book. You might as well try interactive fiction instead.

Edge Management on Have You Personally Used an Honest Head Hunter? · 2003-10-01 08:31 · Score: 1

This is about five years old, so the information may no longer be good, but a recruiter at a place called Edge Management up in White Plains was good for me. They did not put pressure on me and checked in on me for the first few months to see if I was happy.

Car Sharing Services on Hybrid/Electric Vehicles: Should I Buy? · 2003-09-12 07:43 · Score: 1

I don't know how much you actually need to use the car (i.e., do you need it for daily commutes or just occasional drives?). If you don't need one that often, another transportation possibility to consider is a service like Zipcar or Flexcar or one of the other variations in cities around America where you can rent vehicles by the hour. I need a car here in New York only sometimes, and owning one is prohibitively expensive, so Zipcar has been a real bargain. Just so you know...

Careful what you say on What Would You Do With a New Form of Encryption? · 2002-10-09 05:57 · Score: 4, Interesting

From my somewhat scanty introduction to patent law, you might want to be careful about how much you reveal about it before you file a patent application or at least provisional paperwork. My company recently did work to patent a product and we were told we couldn't really discuss it with many people. Furthermore, doing something openly public, such as showing it at a trade show before applying for the patent, would seriously jeopardize the patent process. Now I'm not a lawyer or an expert in patent law, so I can't really say how valid an objection this is, but I'm sharing it here in case it's relevant. If it is correct, I want you to be able to decide whether to patent and not have it decided for you. (Do any real experts have a better assessment?)

Re:the fix-all? on David Brin on "Attack of the Clones" · 2002-09-18 06:03 · Score: 1

It's a neat little idea, but it's a plan with a pretty high body count (even if Yoda is seen as being motivated by good intentions). Given the fact that the Jedi will walk into a trap without thinking, surely there was an easier way?

Remote Controls Standard on Clothing Yourself In Technology · 2002-09-13 09:55 · Score: 1

Regardless of whether it's appropriate for the slopes, it's kinda cool having a controller on the sleeve that's more durable than the standard wired remote control provided with such projects (easier to be operated with gloves).

That's all fine and dandy, but are there any sort of standards for these wired remotes? I'm assuming that each manufacturer picks their own, which kinda sucks. Imagine you decide you want to use an iPod or a Rio, etc. Your jacket pretty much becomes useless (unless there was some sort of swappable internal adapter). It might not even work with different products from the same manufacturer. Also, are these controls usually analog or digital signals? Anybody more clueful than I care to comment? Anybody really care (at least it's not the Nth post about how dangerous this is)?

Slacker Company? on Greenspun on Managing Software Engineers · 2000-11-06 02:40 · Score: 1

Well, it's nice to see that I'm working at a "slacker company", since we only demand about 40 hours a week. Sure, we have managed to do a lot with only a few programmers, but I suppose that's because we lack the amenities of a foosball table and large television, so we actually work those 40 hours.

I am glad that Greenspun provides more vacation time, but I abhor the attitude that the programmer should spend as much time as possible at the company in the hope of some eventual reward. I know that he scorns "annual reviews", but I'm sure that many employees keep working for the even more intangible promise of stock options. It's always sad when those options become worthless. But we're in an industry that chews up young people only to replace them with H-1Bs when they get too old.

Some people in this discussion might find Extreme Programming interesting. It too tackles the problem of managing programmers but it comes up with some different proposals. You can check out the Wiki Version here.

Re:concentric on On the Reliability of DSL Providers... · 2000-09-27 01:54 · Score: 1

I've also had a good experience with Concentric here in NYC. Concentric is a national ISP, so you might want to check them out. Of course, they recently merged with another company and became XO Communications (bloody stupid name).

I went with Concentric because they were willing to give me 4 static IP addresses and didn't mind if I ran a server (connection is always on btw). Not all ISPs allow this, so you should check policy before you sign.

Problems with Inverted Indexing on Search Engines-Does Obscurity Prevent Exploitation? · 2000-09-13 21:07 · Score: 1

Most of the people here have already noted some of the problems with data collection that search engines face. Mainly, pages may have surreptitious content designed to fool search engines about their real agenda. Even Google could be spoofed by a dedicated foe, although it takes a bit more work.

While this is always a pressing concern for those people writing engines, there are other issues that might affect the accuracy of search engines. Mainly, there are certain limitations on the underlying technology, and other technologies are still in early development.

I think every major search engine uses inverted indexes to represent the data. The idea behind this is that you can think of every document as being made up of the following tuples: {docid, termid, position, fieldid}, where each doc has a unique id, each word in the lexicon has an id, and a fieldid can be used to indicate special fields like titles or meta tags. All the engines basically take this information and produce inverted indexes for searching which contain the following tuples: {termid, docid, position, fieldid}. Throw in some mapping tables, sorting, and some compression optimizations and you have the basic idea. When a search comes in, pull up the document list for each term, scan through them for matches (documents that show up in the list for every term), and return the best results.
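To make the tuple flipping concrete, here is a minimal Python sketch. It is only an illustration under some loud assumptions: the function names `build_inverted_index` and `search` are made up for this example, the raw lowercased word stands in for a numeric termid, tokenizing is just whitespace splitting, and there is no ranking, sorting, or compression.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Build term -> [(docid, position, fieldid), ...] posting lists.

    docs maps docid -> {fieldid: text}. The lowercased word itself stands
    in for a numeric termid (an assumption made to keep the sketch short).
    """
    index = defaultdict(list)
    for docid, fields in docs.items():
        for fieldid, text in fields.items():
            for position, term in enumerate(text.lower().split()):
                # Forward tuple {docid, term, position, fieldid}
                # becomes inverted tuple {term, docid, position, fieldid}.
                index[term].append((docid, position, fieldid))
    return index


def search(index, query):
    """Return the sorted docids that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # One set of docids per query term, then keep only the overlap.
    doc_sets = [{docid for docid, _, _ in index.get(term, [])} for term in terms]
    return sorted(set.intersection(*doc_sets))


docs = {
    1: {"title": "Jaguar cars", "body": "the jaguar is a fast car"},
    2: {"title": "Big cats", "body": "the jaguar is a big cat"},
}
index = build_inverted_index(docs)
print(search(index, "jaguar car"))
```

A real engine would rank the matches (using positions and fields, among other things) rather than return every document in the overlap.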

This works well for large collections, but it has a few limitations. For one thing, it can only find documents by words that are in them, so relevant documents that use related words are ignored (I think this is called synonymy). Also, you can have problems with polysemy (a search for "jaguar" could mean a car, a team, etc.). In addition, the lexicon scales badly as the collection grows, which leads most indexes to limit the size of their lexicon, so rarer words get ignored. Finally, it can perform rather poorly for words that appear often (try "market share" for example), since the term lists for these are large and require scanning through large amounts of disk. And mainly, all the search engines just tweak this model, but there might be better solutions out there.

Some research has been designed to tackle the scalability problem. k-Nearest-Neighbors (kNN) works by performing a feature selection on the lexicon and pruning words that aren't really useful for searches (e.g., "slashdot" is more significant than "the"). Some approaches can remove up to 98% of the lexicon without a significant loss in quality. Then documents can be represented as vectors of the remaining features, queries can be mapped into this space, and the k nearest neighbors are returned (e.g., you calculate a dot-product). This scales in size nicely on disk, but you find yourself doing more vector comparisons as your collection size increases, so it's really only practical for smaller collections. It also requires that your searches contain one of the terms in the feature set, which can be a bit limiting.
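A rough Python sketch of the vector idea, with plenty of assumptions: the feature set is handed in already pruned (no actual feature selection is done here), the helper names `to_vector`, `cosine`, and `knn_search` are invented for this example, and similarity is a length-normalized dot product (cosine) over raw term counts.

```python
import math
from collections import Counter


def to_vector(text, features):
    """Count how often each kept feature term appears in the text."""
    counts = Counter(text.lower().split())
    return [counts[f] for f in features]


def cosine(a, b):
    """Dot product normalized by vector lengths; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(query, docs, features, k):
    """Return up to k docids most similar to the query, best first."""
    q = to_vector(query, features)
    # Every document gets compared against the query, which is why the
    # cost grows with the size of the collection.
    scored = [(cosine(q, to_vector(text, features)), docid)
              for docid, text in docs.items()]
    scored = [(score, docid) for score, docid in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [docid for _, docid in scored[:k]]


features = ["slashdot", "linux", "patent"]  # hypothetical pruned lexicon
docs = {
    "a": "slashdot covers linux news",
    "b": "the patent office rejected the patent",
    "c": "the weather is nice",
}
print(knn_search("patent linux", docs, features, 2))
```

Note that a query like "the weather" maps to an all-zero vector here and returns nothing, which is the feature-set limitation mentioned above.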

Some research has been focused on capturing more of the "latent semantic" information in a document. Indeed, Latent Semantic Indexing (LSI) is the focus of much recent research. This technique works by feature selection and transforming documents into vectors in a semantic space that represent their semantic information (break out the linear algebra). Researchers claim that this allows you to find documents with related content, even if they don't match your terms. They also claim it solves the "jaguar" ambiguity problem, but you really need to enter more than one term for it to work (and they have to be in the feature set too). While early research looks promising against test collections, its performance scales poorly and it doesn't work well against noisy collections like the web yet. But research continues.
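For the curious, the linear algebra usually looks something like the following. This is the textbook formulation as I understand it, not necessarily what any particular engine does; the symbols ($X$ for the term-by-document matrix, $k$ for the number of dimensions kept, $q$ for the query's term vector) are my own labels.

```latex
% Truncated singular value decomposition of the term-by-document matrix X,
% keeping only the k largest singular values:
X \approx X_k = U_k \Sigma_k V_k^{T}

% A query's term vector q is folded into the same k-dimensional space:
\hat{q} = \Sigma_k^{-1} U_k^{T} q

% Documents (the rows of V_k, written \hat{d}_j) are then ranked by
% cosine similarity to the folded-in query:
\mathrm{sim}(\hat{q}, \hat{d}_j) = \frac{\hat{q} \cdot \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}
```

Because documents and queries are compared in the reduced space rather than term by term, two texts can score as similar without sharing any words. Computing the decomposition for a large matrix is also where the poor scaling comes from.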

Other research has been focused on natural language parsing in an attempt to recognize the meaning of user queries and documents. This, however, is really tentative, and it hasn't yet shown the kind of success that the less intelligent, more statistical methods like LSI or kNN have.

I hope these rambling notes prove somewhat helpful. Interested slashdotters can probably find some useful primers on the web that explain this better than I can. Also, ACM members should definitely check out the SIGIR conference proceedings (the Digital Library is great).
