Slashdot Mirror


User: MemRaven

MemRaven's activity in the archive.

Stories
0
Comments
193

Comments · 193

  1. There are seminal research papers on this on Distributed Databases? · · Score: 2
    and they all come down to one thing: it can't be done very well, and we should all stop trying. It all got summed up by Jim Gray in a paper I can't find a link to right now.

    IBM had a distributed database project going on back in the System-R days, and they never really got it working. I worked on the Mariposa project at U.C. Berkeley which attempted to solve some of this problem, and it didn't really get that far beyond a data warehousing context. The problem of ensuring that replication along with ownership and transactional semantics were preserved just became too difficult to solve in a purely generic way.

    If you're just interested in high availability query processing, the Mariposa work is probably pretty relevant (a company called Cohera tried to commercialize it). If you're interested in distributed transactions, you've walked into the realm of Tuxedo (by BEA Systems; caveat: a former employer). While specific instances of the problem CAN be solved, one general purpose system is going to have significant problems, so it's best to categorize what you're interested in solving.

    I highly recommend that you dive into the big Stonebraker/Hellerstein book on database system implementation research papers and start reading up. It's a VERY difficult problem. Hellerstein is part of a new project which is also trying to solve some of the problems in a different way.

  2. Re:why so many computer innovators gay? on Interview With Eric Allman And Kirk McKusick · · Score: 2
    There's a Fast Company article on this. While it starts out by saying that the whole point is to look for gay centres (again to get some big press and readership, methinks), the analysis is actually pretty common-sense.

    Gay men, in large numbers (to make a gross generalization), tend to be drawn to fairly progressive urban environments (ever heard of a large gay community in someplace like Topeka, Kansas?). So, in fact, are the type of people who want to work at new-economy companies. Thus having a strong gay community becomes a reliable proxy for all the other things in an urban centre which make it desirable for young professionals to reside, not the cause of that. In fact, you could replace any number of things with gay-tolerance as the proxy (such as immigrant populations).

    Which means that if you're setting up a company in an area which is gay-tolerant, you're probably going to end up employing some.

  3. Re:Another reason to stick to the RFC on New E-Mail Vulnerability - Trust Your Neighbor? · · Score: 2

    Netscape does this. I can tell it to use JavaScript for web pages, but to disable it for mail/news, which is precisely what I've done. I'm not sure which client you're using, but mine works just fine.

  4. Re:Enterprise-grade messaging for Linux/Unix on What Mailbox Format Do You Use And Why? · · Score: 2

    this is basically X.400, upon which OpenMail is based. But instead of having a JDBC/ODBC/Whatever link to a relational database, the architecture is to have a mail-specific "mail store" which stores the mail in SOME way, and then client tools which just talk to the mail store (and client tools in this case are things like an IMAP daemon). It's basically this model, assuming that you are dealing with your mail store as a Mail-specific interface rather than raw SQL.

  5. They're under copyright. on Methods For Shorthand Notetaking? · · Score: 2

    Gregg shorthand is under copyright, and that's why you can't find it online. The Gregg copyright holders (I think he's dead, but his family ain't) get money for every Gregg book and notepad (I'm not kidding on that....if you see something with "Gregg Rule" on it, they get money). They have no incentive to put it online for free right now.

  6. Asking for a Coke on The Pillsbury Doughboy vs. Engineers · · Score: 2
    My brother, who IS a lawyer, told me that the reason why whenever you go to a restaurant where they serve Pepsi and you ask for a Coke they ask you "Is Pepsi okay?" is because of the Coke Lawyers.

    They actually send associates around when they're on travels to various restaurants and check the responses. Those who don't clarify are asked (politely, according to my brother) to correctly phrase the question, and the store goes in the big bad offenders database.

    It's their way of preserving the fact that a Coke is their particular beverage. They (the brand owners) really DO take this stuff seriously.

  7. Re:You're just inconveniencing the Post Office on Stuffing Junkmail Postage-Paid Envelopes? · · Score: 5
    When I used to work for an insurance company, and I dealt with a lot of mail (bill payments from customers, not junk mail responses), that wasn't quite true. According to the office manager, we paid a license to be able to do Business Reply mail. But we got a bill every month from the postal service with the actual amount of things which were returned.

    So you're half right. There is a cost just to be allowed to spam you with those envelopes. But it does cost the company per-envelope.

    I can't remember if we got charged for the actual weight.

  8. Re:Hopefully so (was: Very good news) on ResierFS In Latest 2.4.1 Prepatches · · Score: 2
    I had the same thing. Which version of the kernel were you using? There was a known problem with smp.c which very seldom reared its ugly head, unless you had a FS which really taxed the kernel. Ext2fs could very seldom do it. ReiserFS is "advanced" enough that it did it very often. It was fixed in 2.2.16. Not the fault of ReiserFS.

    My recipe for disaster was to have a really big benchmark running.

  9. Re:More info on ResierFS In Latest 2.4.1 Prepatches · · Score: 2
    At least in part, it's faster because rather than using a linear list of the files in a particular directory/inode, it uses a B+ tree. Which means that finding a particular file isn't an O(n) lookup, it's an O(log n) lookup.

    For large directories with a lot of files, this decreases the number of inode pages which are necessary to look up a particular file. For smaller directories, it results in a log n lookup in memory (because the individual inode pages can be binary searched).
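
    As a rough illustration of the O(n) versus O(log n) point, here is a minimal Python sketch comparing a linear scan of directory entries with a binary search over one sorted block of names. It is not a B+ tree and not ReiserFS code; the function names and the file names are made up for the example.

```python
def linear_lookup(entries, name):
    """Scan a list of names front to back; returns (found, comparisons)."""
    comparisons = 0
    for entry in entries:
        comparisons += 1
        if entry == name:
            return True, comparisons
    return False, comparisons


def sorted_lookup(sorted_entries, name):
    """Binary search over a sorted list of names; returns (found, probes)."""
    lo, hi = 0, len(sorted_entries)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_entries[mid] < name:
            lo = mid + 1
        elif sorted_entries[mid] > name:
            hi = mid
        else:
            return True, probes
    return False, probes


# Hypothetical directory with 10,000 zero-padded (hence already sorted) names.
names = [f"file{i:05d}.c" for i in range(10_000)]
print(linear_lookup(names, "file09999.c"))   # worst case for the linear scan
print(sorted_lookup(names, "file09999.c"))   # a handful of probes
```

    The linear scan has to touch every entry to find the last file, while the binary search needs at most 14 probes for 10,000 names. A real B+ tree applies the same idea across disk pages, which is where the saved page reads come from.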

    Finally, journaling of the ReiserFS form results in a speedup because it can write the bigger inode block at some point in the future.

    NTFS was at least partially slower because the first version used a transactional scheme, which always introduces overhead. I don't know much more, but I know that at least the tree-structure for inodes in ReiserFS is responsible for quite a bit of the help. Our builds were 15% faster (because all that dependency checking hits the FS itself hard, much more than the disk itself).

  10. I'm assuming an RDBMS for the database. on RAID Solutions For Terrabyte Databases? · · Score: 5
    In which case, things get a whole lot trickier than just a bunch of files, because you have to consider what your usage pattern is (in terms of what the database is doing) and how that impacts the disk usage (in terms of HOW the database does it).

    The first thing is to talk to your DBA and get his/her input. DBAs, competent ones, have done a lot of this type of work in the past, and they'll have an enormous amount of help to provide you. They'll know your usage pattern by heart, and be able to provide you with some help as to usage.

    The first thing to realize is that for most RDBMS usage patterns, RAID is a Very Very Very Bad Thing. But when I say "most", I mean "most with updates to live data."

    RDBMSs keep data in four main types of storage, and it's important to understand them:

    • Main Table Storage. This is where your data actually "lives", and is ironically the least important storage wise.
    • Temporary Table Storage. This is the storage space for temporary space and temporary tables, which is extremely useful for performance management.
    • Index Storage. This is where data indexing structures live, and is extremely performance critical.
    • Log Storage. This is where the log for your system lives (physical and logical) and is also extremely important for performance.

    The most important thing for performance is to PHYSICALLY segregate ALL four types as much as possible. For example, if you're going into multi-terabyte databases, you might want all four types of data not just on different disk arrays (i.e. RAID controllers), but also on different SCSI channels, and different host adapters (i.e. multiple SCSI or FC-AL cards) altogether.

    You also want to bear in mind that your update speed is limited by the ability to handle log writes. Log writes aren't limited by bandwidth. They're limited by the latency of each disk. Every disk can handle a certain number of operations per second. Even if you add more disks in a RAID configuration, you're never going to be able to handle more transactions per second, because you're not increasing the number of operations of any of them, and all of them must be touched for every transactional write.
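
    A back-of-the-envelope version of that argument, as a Python sketch. The model is deliberately crude and is an assumption of this example: each synchronous log write costs one average rotational latency (half a revolution), caches and group commit are ignored, and a striped log touches every member disk on every write, as described above. The function names are hypothetical.

```python
def ops_per_second(rpm, seek_ms=0.0):
    """Per-disk operation rate: one op per half rotation plus an optional seek."""
    half_rotation_ms = 0.5 * 60_000 / rpm
    return 1000.0 / (half_rotation_ms + seek_ms)


def striped_commit_rate(rpm, disks):
    """One log striped over `disks` members: every commit waits on all of them,
    so the commit rate stays at the single-disk rate however many are added."""
    return ops_per_second(rpm)


def separate_logs_commit_rate(rpm, logs):
    """`logs` independent log devices, each committing at the single-disk rate."""
    return logs * ops_per_second(rpm)


print(ops_per_second(15_000))                 # one 15k RPM disk
print(striped_commit_rate(15_000, 8))         # eight disks, one striped log
print(separate_logs_commit_rate(15_000, 8))   # eight separate log devices
```

    Under these assumptions a 15k RPM disk manages about 500 log operations per second. Striping eight of them leaves that number unchanged, while eight independent log devices multiply it, which is the reasoning behind giving each log its own spindle.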

    So with that in mind, allow me to recommend something:

    • Get a bunch (as many as you can afford) of 15k RPM disks. Each of those should be a separate log device. Spread them throughout your SCSI or FC-AL adapters, as evenly as possible. If you are going to use RAID, which you actually should for this, you should have quite a few RAID/1 pairs, each one mirroring a single log device. If you're using Solaris or another commercial UNIX, software RAID is fine for this as long as you have hot swap. Otherwise, use cheap hardware RAID. Even if you're using FC-AL for everything else, you might want to consider plain old SCSI for this stuff, because latency is your #1 concern, not bandwidth.
    • Get a bunch of smaller, but at least 10,000 RPM drives for your index storage. They should be on quite a few different hardware RAID adapters, and you should be using RAID/0 for them. For this, you don't care about losing a drive. The worst that can happen is reduced performance while you rebuild an index, you'll never lose any data. Create as many logical units as you can get away with, and again spread them out.
    • Get larger, not necessarily as fast drives for your primary partitions. These can and should be on very large RAID/5 partitions. Any commercial RDBMS will handle slower drives for these with very little additional overhead. The log and index partitions are your bottlenecks. Each SCSI channel or FC-AL adapter should have the bulk of its bandwidth be taken up by these. THIS, coincidentally, is where EMC comes into play, along with the Index storage.
    • For temporary space, get some hardware RAID adapters and some reasonably fast drives, and put them on RAID/0, not RAID/5. Again, this is not your core data, who cares if it goes down?
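
    The recommendations above can be written down as a small planning table and checked mechanically. This is only a sketch: the controller names (scsi0, fc0 and so on) are invented for illustration, and the check covers just the "physically segregate the four storage types" rule.

```python
# Hypothetical layout: storage type -> RAID level and the adapters it uses.
layout = {
    "log":   {"raid": 1, "controllers": {"scsi0", "scsi1"}},  # mirrored pairs, latency-bound
    "index": {"raid": 0, "controllers": {"fc0", "fc1"}},      # rebuildable, so striping is fine
    "table": {"raid": 5, "controllers": {"fc2"}},             # big, slower drives
    "temp":  {"raid": 0, "controllers": {"fc3"}},             # scratch space
}


def shared_controllers(layout):
    """Return {controller: [storage types]} for adapters used by more than one type."""
    users = {}
    for kind, spec in layout.items():
        for ctrl in spec["controllers"]:
            users.setdefault(ctrl, []).append(kind)
    return {ctrl: sorted(kinds) for ctrl, kinds in users.items() if len(kinds) > 1}


print(shared_controllers(layout))  # an empty dict means the four types are segregated
```

    Anything the function returns is a controller where two storage types compete for the same channel, which is the situation the list above tries to avoid.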

    The number one advice I can give is to consult with others. If you haven't done this before, there are people (your DBA, your database vendor, your hardware vendor, your systems integrator) who have. This is serious business, and not something to screw around with. Terabyte-level databases are still NOT so common that everyone can and should attempt them. Having terabyte-levels of data throughout an enterprise is, but in one application it isn't. You'll probably not get it right the first time, so take your time and consult with every one of your vendors on capacity and performance planning.

    Not to be crass or mean, but if you're asking slashdot, you probably shouldn't be doing this all by yourself.

  11. Dumbass Lame Duck Politicians on Dark City, San Francisco? · · Score: 5
    Uhm, hate to spoil you with this, but in my understanding, the energy market deregulation was a last-ditch effort by a bunch of people who were just (about to be?) turned out of office in droves.

    The Republican-controlled state legislature AND governor's mansion have since been replaced with Democrat-controlled legislature and governor. When the legislation to deregulate passed, the GOP knew the writing was on the wall.

    I hate to tell you this, but we knew not to trust the bastards, and they got us in the end. Blame that CA state GOP, not the voters.

  12. Re:This is nothing new. on Dark City, San Francisco? · · Score: 2
    Quite a bit of the Bay Area, at least my apartment, uses natural gas for heating. But that's only part of the problem.

    Much of our energy reserves come from natural gas plants. Because it's been cold, natural gas prices have risen dramatically and supplies haven't. Which means that burning natural gas for heating or for electricity means little when everybody needs the same bit of natural gas.

    Now if I can only get the significant other to turn off the (gas-powered) fireplace. Loves the atmosphere....sheesh!

  13. Re:The Bus a Bus on What Do You Need To Watch For In A Linux SMP System? · · Score: 2
    More interesting, though, is that this doesn't hold up past 4-way boxen.

    I know that the REALLY big Unisys/Compaq boxen (32-way P-III Xeons) use what's known as CMP, and I think that it's the basis in a scaled down form for most 8-way Intel boxen. The acronym is for Cellular Multi-Processing, and the basic idea is that you have NUMA without calling it NUMA. Each "cell" is some number of processors (up to 4), some number of DIMMs, and a connector to a crossbar which connects everything.

    I know that this is a very common architecture for cheaper SMP type boxes. Crossbar complexity grows quadratically with the number of things attached to it, so you take some buses and do a crossbar between THOSE, rather than just a huge crossbar. And it works as long as you're not able to max out any particular bus.
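
    To put rough numbers on the "crossbar between buses" idea, here is a small Python sketch. It assumes the textbook model in which a crossbar joining n endpoints needs n × n crosspoints, and it counts each cell as a single endpoint on the central crossbar; both are simplifications made for this example, not a description of any vendor's hardware.

```python
def flat_crosspoints(endpoints):
    """Crosspoints in one big crossbar joining every endpoint directly."""
    return endpoints * endpoints


def cellular_crosspoints(endpoints, cell_size):
    """Crosspoints when endpoints share a bus inside each cell and only
    the cells themselves are joined by a crossbar."""
    cells = -(-endpoints // cell_size)  # ceiling division
    return cells * cells


print(flat_crosspoints(32))          # 32 processors on one crossbar
print(cellular_crosspoints(32, 4))   # 8 cells of 4 processors each
```

    For 32 processors the flat design needs 1024 crosspoints, while 8 cells of 4 need 64 plus eight ordinary buses. The price is that each cell's bus can become the bottleneck.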

    The core difference between something like CMP and something like the SGI NUMA boxen is that the SGI boxen have crossbars within each unit. So if you have a processing unit of 4 CPUs and 4 DIMMs, there's a crossbar between each of those, and a whole separate crossbar connecting each processing unit.

    Connecting more than 4 of anything on a bus is worthless. I can max out the bus on a dual-P-III pretty easily. Give me two more and you'll start to see declining performance as everything waits for memory to be copied.

  14. Re:Ho-hum on What Do You Need To Watch For In A Linux SMP System? · · Score: 2
    A couple of things:

    • What sorts of applications are you running? I've seen a MUCH bigger difference for multithreaded applications (on the order of 25% or more)
    • Athlons would be nice if there was a chipset which supported more than 2 of them (or even if THAT was available). Considering that we're talking specifically about SMP, Athlons are irrelevant.

  15. Consider particularly Alpha and SGI on What Do You Need To Watch For In A Linux SMP System? · · Score: 3
    And ignore Linux. This seems to be a free-from-zealots conversation, so let me point out that for $200k, you're buying a LOT of CPU. Linux won't scale to the levels of CPU that you're getting. The use of signal-level threading and other kernel details limits Linux on a really big SMP box.

    SGI's systems are really designed for this. The NUMA architecture is a way to mix what you're doing (i.e. lots of cpus, lots of memory) with the ability to make some things "closer" to others. If you're able to think of your app at least slightly NUMA like, it works well.

    Otherwise, let me recommend the Alpha boxes. Tru64 UNIX is a phenomenal operating system, and it scales beyond your imagining. It really is that good.

    The other thing to think about is what kind of CPU usage are you doing. Assuming that you're using floating point computation, you need to immediately discount the Intel architecture. It's STILL hobbled by the terrible FPU that it's had for years, which is why the Athlon kicks it around on this stuff. Alpha and MIPS have the best FPUs implemented (SPARC is okay, but definitely not as good as the Alpha).

  16. XP is NOT a panacea on Clearcase vs. CVS? · · Score: 3
    I'm sorry, but I've worked on XP projects. Once you get the project above 5-10 people, you can do XP, but you HAVE to do it in XP groups. In other words, you find a group of like 10 people, give them control over a subproject, and then give them their own branches.

    What about stable release lines? Your team might have already started working on 2.0, but now you find a bug in 1.1. You expect all your customers to take an unstable 2.0 version? (and don't give me all this "always ready to ship" crap....if you don't have all the functionality ready, you're not ready to ship). You need to have multiple branches from there. Without good branch capabilities, you can't merge the fix in the 1.1 line with the 2.0 development.

    So you're saying that you NEVER branch, and thus NEVER merge? How do you handle long-running projects? Projects with multiple releases? Projects with 100 developers?

    I agree that complicated processes are bad. But processes should be there to help developers do their job. A good source control system with good branching and merging helps you when you need it.

  17. PLEASE consider perforce. on Clearcase vs. CVS? · · Score: 4
    The Perforce Website

    I've used CVS, PVCS, VSS, (but not ClearCase), and Perforce is the best system I've ever used, bar none. Let me go through some of the key points:

    • Quality of Support. I put this first because I think it's that important. I get same-day turnaround whenever I contact them and it's not a weekend (and I've gotten turnaround within an hour at 1am on Sunday, so even then you often do), their tech support people know what they're doing and are willing to go the extra mile.
    • The Model's Like CVS. I think this is actually important. To get all the "advantages" of clearcase, you have to be using the ClearCase file system, which basically means that your compiles are over NFS. Do you really want that? P4 gives you a local copy of all the files, just like CVS.
    • Server Centric Model. While this can impose some difficulties for fully disconnected access, it saves your ass in a lot of places. Thinking of deleting a file and want to know if someone's working on it? You can see. Thinking of doing a branch and want to see the status of the branch on people's current machines? You can see. This can save you quite a few times.
    • Multi-Platform Support Rocks. I've NEVER found a platform which P4 doesn't support. Mac, Win, Linux (we're using it on Intel and Alpha), Solaris, even IA-64 if you've got one, they run on everything. Very nice.
    • Ease of Use. If you've got people who are familiar with ANY local-file based source control, they can be up to speed extremely quickly. I've got people who are familiar with CVS who are working extremely well in like 2 hours with P4, rather than weeks of training in CVS. So your training costs are virtually nil with P4 if your people already know CVS or VSS.
    • Speed. It's FAST. REALLY fast. Because of the server-centric model, it's able to determine extremely easily what you need to download and what you don't. It uses things like MD5 digests to determine whether you're actually in sync, it uses its database of what you've stored, etc. And then it just downloads a compressed delta of the file and modifies it locally. If you've got people working over not-so-fast links, this will save your ass.
    • It's TRANSACTIONAL. The basic unit of transactions is the Changelist. When you check stuff in, it's completely atomic. Either everything's submitted, or nothing is. That's it. So at any time you have, without labelling, a complete, transactional history of everything that you've ever done, and it means that you can never download in inconsistent states. Everyone is consistent all the time. This also means that you don't have to constantly label everything, because the transaction ID acts as a unique identifier for the state of the database at any given point in time.
    • Branching/Merging Rocks. I've never had a case where P4's branching/merging support didn't work perfectly. They detect three way merges, they detect multiple lines of integration/development, it all just WORKS. I can't stress how important this is for quality development, and it's infinitely better than the support in CVS.
    • Ease of Administration. I administer 2 perforce servers. I spend, on average, about 2 minutes doing administration. The thing just works perfectly.
    • Cheap Hardware. Unlike Clearcase, you can get a fairly cheap box for P4. We're running 25 users with about 70 clients on a $5k box (and most of that is RAM and RAID array), and it's so fast that most people never even notice, because its downloads are about as fast as a filecopy on the network.
    • Stores in digested RCS files. So it gets around the corruption issue with some tweaking, but the core files are basically RCS files, which means that if you decide to give it the boot for whatever reason, you can integrate them into another source control system very easily. CVS will have the same issues with corruption that VSS does, because they don't attempt to deal with the problems that RCS file corruption can incur.

    I'm a little partial, but it's an amazing system that you really should evaluate. I think that if you're looking at CVS and ClearCase, P4 gives you the happy medium: faster and cheaper than ClearCase, more enterprise-friendly than CVS.
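
    The changelist point is easier to see in a toy model. The Python sketch below is not how Perforce is implemented and uses none of its commands; `ToyDepot` and its methods are hypothetical names. It only shows the two properties described above: a submit applies all of its edits or none, and the changelist number alone identifies a complete state of the depot.

```python
class ToyDepot:
    """Hypothetical in-memory depot with all-or-nothing submits."""

    def __init__(self):
        self.files = {}     # path -> content at the head revision
        self.history = []   # history[i] is the full snapshot after changelist i + 1

    def submit(self, changes):
        """Apply every edit or none. `changes` maps path -> new content,
        with None meaning "delete this file". Returns the changelist number."""
        staged = dict(self.files)          # work on a copy, never on the head
        for path, content in changes.items():
            if content is None:
                if path not in staged:
                    raise KeyError(f"cannot delete missing file: {path}")
                del staged[path]
            else:
                staged[path] = content
        self.files = staged                # publish only after every edit succeeded
        self.history.append(dict(staged))
        return len(self.history)

    def sync(self, changelist):
        """Return the complete depot state as of the given changelist number."""
        return dict(self.history[changelist - 1])


depot = ToyDepot()
first = depot.submit({"a.c": "v1", "b.c": "v1"})
try:
    depot.submit({"a.c": "v2", "missing.c": None})   # second edit is invalid
except KeyError:
    pass                                             # nothing from this submit was applied
second = depot.submit({"a.c": "v2"})
print(first, second, depot.sync(first))
```

    Because the failed submit never touched the head, anyone syncing sees either the state before a changelist or the state after it, never something in between, and syncing to a changelist number does the job a label would otherwise do.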

  18. Re:Depends on your personal tradeoffs on Is The U.S. No Longer The Choice For Freedom? · · Score: 3
    True enough. But since most constitutional interpretation has very little to do with the "letter of the law," that makes more sense with respect to legal code than constitutional law.

    For example, the first amendment gives you the right to "freedom of speech." Does that mean you can draw a picture? Depends on your definition of "speech." The bill of rights is intentionally VERY vague to allow for a lot of interpretation as times change.

    A situation even worse is the Civil or Napoleonic Code (tangent: the only part of America with a formal Civil Law situation is Louisiana), where the letter of the law is the ONLY thing that matters. If the law says you may "speak freely," then it means "speak," not "use free expression of any kind."

    If we could have "letter of the law" arguments in constitutional law, we wouldn't need a Supreme Court to interpret what the constitution actually SAYS. It would be black and white. The fact that it ISN'T is what makes it so great.

  19. Re:Depends on your personal tradeoffs on Is The U.S. No Longer The Choice For Freedom? · · Score: 2
    Two Words: Parliament Supreme.

    That which Parliament grants can be taken away. In the absence of an external rulebook which trumps even Parliament, a "bill of rights," even if it's an incorporation of the European Code of Justice, is meaningless.

  20. Re:Depends on your personal tradeoffs on Is The U.S. No Longer The Choice For Freedom? · · Score: 2
    Actually, yes, particularly within the EU, but also the US. While there are some temporary dispensations made (such as the US H-1B program), the basic problem is that no country, particularly in Europe, wants the hordes of unwashed masses to come in and "take their jobs." But to avoid the taint of ethnic discrimination, they block ALL people from entering. The situation in Europe is particularly dire right now. (For example, in Germany, citizenship is STILL based on your ethnicity....what percentage of German stock you have in your background determines whether you get to be a citizen.)

    Even in the US, for example, most forms of immigration have been shut down. Unless you're on an H-1B, getting a green card from, say, the UK is virtually impossible. There were 0 of them granted via the lottery over the past few years.

    If you're curious, just call the local consulate of your choice and say you're interested in immigrating. Find out what their response is.

  21. Depends on your personal tradeoffs on Is The U.S. No Longer The Choice For Freedom? · · Score: 5
    So there are plenty of micro countries where you might have more freedom (look for where they're doing money laundering/anonymous transactions). The issue is whether you're interested in the same standard of living as you're getting in the US.

    If you're really interested in keeping the same standard of living as you're getting in the US, you've only got a few choices, namely the EU, the US, Canada, and a few countries in Asia (Hong Kong, Singapore, and Japan notably).

    For Asia, you're dealing with a situation which might seem like it offers more things like privacy, but have much less open political processes (like Singapore) which might actually reduce your overall level of freedom.

    For the EU, while you'll get more chance to protect your privacy (the EU is much more forward thinking than the US when it comes to individual rights), many EU countries offer MUCH less than the US when it comes to the conventional US perspective on personal freedom (higher taxes, more government regulation, byzantine regulation on things the US takes for granted [like shop opening hours in Germany and the lack of a Bill of Rights in the UK]). So while you might get some things, you give up others in return.

    So it depends on what your personal tradeoff is. If you're most concerned with fighting your perceived corporatism, you want to leave. If you're mostly interested in your personal liberty, you probably want to stay.

    I can't really comment that much on Canada....can someone else fill in the gaps?

    But the entire question is completely moot, as national standards have completely removed your ability to emigrate to anywhere which is a developed economy (while you can LEAVE the US pretty easily, you can't go TO anywhere else). So you're pretty much stuck here regardless.

  22. Re:Of course it is. on Recharging Laptop From Plane Headphone Jacks? · · Score: 3
    The jacks that those will work with are specific recharging ports available on some airlines. Specifically, I've NEVER seen them in coach. In business class on international flights, I see them all the time. But even First Class on domestic flights seldom have them (flights are too short to make them that worthwhile, and they haven't upgraded the seats up there to the seats they can charge an extra $5k for).

    The specific question is whether you can charge your laptop on something which was not designed for it, and probably something you're going to find in Coach class.

  23. Re:my experience on Dot-com Unhealth Benefits Other Industries · · Score: 2

    Hmmmm....my experience (from startups in the software field, NOT startups in the dot-bomb field) is that the opposite is true in project management. I've found that most startups suffer from a complete lack of project management, so no one really knows what everyone is working on.

  24. Re:What problems with the Kinesis? on Non-Traditional Keyboard Reviews · · Score: 2
    The Maltron Home Page.

    They're slightly more expensive than the Kinesis, but it's the same basic concept. Slightly different. I tried the Maltron before the Kinesis, and now I prefer the Kinesis. Be forewarned, though, I had some issues running Debian with my Maltron, but that was a LOOOOOOONG time ago (with the 2.0 kernel), so I dunno if they work now.

    If you like the general system for the Kinesis, but don't like the key action, you might like the Maltron quite a bit. Mushier keys. But if your problem is with key motion, the keys are actually farther away from each other (and larger) than the Kinesis, so it probably won't help. My problem's with Pronation, so both work fine for me.

  25. Re:What problems with the Kinesis? on Non-Traditional Keyboard Reviews · · Score: 2
    Yup, my problems were always caused by wrist motion (specifically, pronation) and the fact that my hands are kept completely neutral and stable throughout my typing with a Kinesis is probably what's made the most difference.

    The Maltron (you might want to try it) has very mushy keys, probably too mushy for my tastes. Quite frankly, when I don't have a significant tactile response to the keys it REALLY frustrates me and my typing accuracy goes way down. But then again the second fastest keyboard I've ever used was one of the PS/2 keyboards with the "Click" sounds, and even though my fingertips were numb at the end of typing on it I was hitting > 100wpm.

    Have you ever tried the Maltron?