You ask why someone would do SMP with the power and flexibility of Beowulf? And I can only answer you should do more research before you sing the praises of any technology.
Any cluster technology is only as good as its interconnect, to start with. Even were you to use gigabit ethernet, it would pale in comparison to even the fairly narrow bus that Intel processors run on. Look at higher-end crossbar architectures for SMP and you are looking at a huge difference in raw bandwidth, as well as latency.
Keeping this in mind, you have to start thinking about the cost of performing IPC between nodes or processes on disparate nodes accessing the same memory range. This becomes especially messy when you have to deal with cache coherency.
For example, let's say processor 1 on node 1 (P1N1) decides to read memory address A on node 2 (N2). Fine, the contents of the address are shipped to P1N1, which stores them in its cache. Next, processor 1 on node 3 (P1N3) decides to read that memory address. N2 notices that the page is still mapped to P1N1, so it has to send a request to N1 to flush the processors' cache lines, so that it can update the actual memory address before sending the value to P1N3.
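To put rough numbers on that example, here is a minimal sketch in Python that just counts the interconnect messages. It is a toy model I made up for illustration (the `HomeNode` class, its fields and the message names are all hypothetical), not how any real coherence protocol is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class HomeNode:
    """Node that owns the memory and tracks which remote node caches each address."""
    name: str
    memory: dict = field(default_factory=dict)
    cached_by: dict = field(default_factory=dict)  # address -> name of caching node
    messages: list = field(default_factory=list)   # log of interconnect traffic

    def read(self, requester: str, address: str):
        self.messages.append(f"{requester}->{self.name}: read {address}")
        holder = self.cached_by.get(address)
        if holder is not None and holder != requester:
            # Another node may hold a newer copy, so ask it to flush first.
            self.messages.append(f"{self.name}->{holder}: flush {address}")
            self.messages.append(f"{holder}->{self.name}: writeback {address}")
        self.cached_by[address] = requester
        self.messages.append(f"{self.name}->{requester}: data {address}")
        return self.memory[address]


n2 = HomeNode("N2", memory={"A": 42})
n2.read("N1", "A")  # P1N1 reads A: request plus reply
n2.read("N3", "A")  # P1N3 reads A: request, flush, writeback, reply
for message in n2.messages:
    print(message)
```

The second read costs twice as many trips over the interconnect as the first, and every one of those trips pays the full network latency.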
I would go further in depth but I don't think I am eloquent enough at this time of the morning to explain it properly.
If the subject interests you, I would suggest picking up a copy of "In Search of Clusters: The Ongoing Battle in Lowly Parallel Computing" by Gregory Pfister. The book goes into great depth in describing the scalability issues with clusters as compared to SMP in chapter 6. :)
There is also another post by me in this thread talking more about the applications where Beowulf simply does not make sense.
*sigh* Unfortunately it seems that scalability cannot be brought into a discussion these days without a mention of Beowulf. And even more unfortunately, the people who are first to propose Beowulf as the end-all, be-all solution for all your scalability needs are the people who don't understand the basic concepts behind clustering.
This is not to disparage Beowulf in any way. It is a wonderful piece of software for building large scale distributed processing systems that run custom (typically scientific) applications written to one of the various parallel processing libraries such as PVM or MPI.
However, we have to look at the larger picture and consider the needs of more common computing tasks, such as high availability and scalability for business-class applications.
A "commercial" cluster of this type must provide application failover that is transparent for clients. Beowulf does not do this because it is not what it is designed for.
For that form of scalability, you would be better off looking at something like Veritas ClusterManager, Linux Virtual Server or TurboCluster. And even then you would have to look at what your application is and decide what approach to high availability is right for you.
As an example, look at the approach that TurboCluster uses. Incoming network connections do not go directly to the machines that service them; instead all incoming connections go to a dispatch server, which then sends them to the back-end servers based on availability as well as current load.
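A minimal sketch of the dispatch idea in Python, assuming a simple "least connections among live servers" policy. That policy and all the names here (`Backend`, `dispatch`, `web1` and friends) are my own assumptions for illustration; I am not claiming this is the algorithm TurboCluster actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    name: str
    alive: bool = True
    active_connections: int = 0


def dispatch(backends: list) -> Optional[Backend]:
    """Pick the least-loaded live backend, or None if every backend is down."""
    live = [b for b in backends if b.alive]
    if not live:
        return None
    target = min(live, key=lambda b: b.active_connections)
    target.active_connections += 1
    return target


pool = [
    Backend("web1"),
    Backend("web2", active_connections=3),
    Backend("web3", alive=False),
]
first = dispatch(pool)   # web1: alive and idle
pool[0].alive = False    # web1 fails
second = dispatch(pool)  # web2: the only live server left
print(first.name, second.name)
```

Note that everything hangs off the one `dispatch` function: if the machine running it dies, nothing gets routed at all.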
Then compare that to the way Unixware Non-Stop Clustering works. With NSC, each node in the cluster has its own IP address. In addition to that, there is a "Cluster Virtual IP" (CVIP) address that is used as an alias for the network card in one of the nodes. In the event of a failure of that node, one of the other machines in the cluster will alias its own interface to the CVIP and perform a gratuitous ARP broadcast, thereby overwriting the ARP cache of any machines on the local network that may still contain the MAC address of the previous owner of the CVIP.
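The CVIP takeover can be sketched as a toy simulation in Python. The `Lan` class, the addresses and the node names are hypothetical, and it assumes every host on the segment simply overwrites its entry when it hears the announcement; real network stacks are pickier about when they accept one.

```python
class Lan:
    """Toy broadcast domain: every host keeps an IP -> MAC ARP cache."""

    def __init__(self, host_names):
        self.arp_caches = {name: {} for name in host_names}

    def gratuitous_arp(self, ip, mac):
        # Unsolicited announcement: each host overwrites its entry for ip.
        for cache in self.arp_caches.values():
            cache[ip] = mac


CVIP = "10.0.0.100"  # hypothetical cluster virtual IP
NODE_MACS = {"node1": "00:00:00:00:00:01", "node2": "00:00:00:00:00:02"}

lan = Lan(["client_a", "client_b"])
lan.gratuitous_arp(CVIP, NODE_MACS["node1"])  # node1 initially owns the CVIP
before = dict(lan.arp_caches["client_a"])
lan.gratuitous_arp(CVIP, NODE_MACS["node2"])  # node1 fails, node2 takes over
after = lan.arp_caches["client_a"]
print(before[CVIP], "->", after[CVIP])
```

The clients never change the address they talk to; only the MAC behind it changes.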
The dispatch method has the advantage that it provides a simple method for providing both high-availability and load-balancing with very little added administration. However you do run into the problem that the dispatch server is then a single point of failure as well as a potential traffic bottleneck.
The CVIP method is more heavy-handed; however it does eliminate the single point of failure and does away with the need for a dedicated machine as a dispatcher.
Scalability is a more difficult issue. With Beowulf it isn't as much of an issue, as applications are written specifically to parallelize the task. However, business applications are typically written to run as a single threaded process or as a group of processes communicating via IPC mechanisms. Not surprisingly, this will not scale on Beowulf.
The problem is that to distribute threads or IPC across a cluster, you need to maintain something commonly referred to as a Single System Image (SSI). This can be thought of as the logical extension of SMP into the concept of clustering. Not only does each processor in a machine have equal access to all resources within that machine, each node has equal access to all resources within the cluster.
In other words, if an application launches multiple processes, and one of those processes gets migrated to another machine, it has to be ABSOLUTELY transparent to the application. So you need things such as a single process list, a unified namespace for devices, some method for sharing devices (either via multiple paths or through shipped I/O), transparent IPC across the cluster, etc.
This is a very tricky proposition. And the sad part is that most applications still need modification to scale properly on an SSI cluster, due to the bottlenecks in the interconnect technology. This is especially true when ethernet is being used as an interconnect, as it suffers from both bandwidth issues and latency issues.
*looks*
It seems like I've gotten to rambling a bit. My point is that there is no single way to scale that suits all possible needs. Beowulf is one technology, service dispatch is another, SSI clustering is another, etc, etc.
> I had to deal with them a decade ago and I found their pricing scheme to be confusing and annoying.
It still is. As part of my job I unfortunately have to do pre-sales configs on SCO servers.
Customer: I need a copy of SCO Unix.
Me: Openserver 5, Unixware 2 or Unixware 7?
Customer: Can you give me parts for all three?
Me: Well, if it's Openserver, we would need to know if they want host edition, enterprise edition, or desktop edition. We would also need to know how many users they have, as well as how many processors the machine has.
Customer: Oh. What about Unixware 2?
Me: Unixware 2 is a little easier. We would need to know whether they want the application server or the personal server. And again we would need a user and a processor count.
Customer: Hrmm... Is Unixware 7 any easier?
Me: No, actually it's worse. This one has a lot more editions - base, business, departmental, enterprise and data center.
Customer: How do we know which one to get?!?
Me: It depends on how many users, the number of CPUs, how much memory you have, whether you need Windows file and print services, whether you need bundled backup software, whether you need a volume manager...
Customer: Never mind. I'll call my customer back and get more information.
And that isn't even dealing with upgrades, which are even messier.
> And their OS is so bad compared to an advanced multi-user OS like linux.
I agree with the sentiment, but I question your qualification to pass judgement on their product if you don't realize they don't have one single operating system.
To be more specific, they are *currently* selling Openserver 5.0.5, Unixware 2.1.3 and Unixware 7.1.1 (and yes, that is a different operating system than Unixware 2.1.x).
IMHO, their products really aren't terribly good, but mine is an informed opinion, having done support on all three operating systems as well as their previous products (SCO UNIX, SCO OpenServer 3.0, SCO OpenDesktop 3.0, SCO Xenix) and holding every certification they currently offer.
And it should be said that at least one of their operating systems does have some advantages over Linux. Unixware currently scales better, supports larger files and filesystems out of the box, has a proven extent-based journaled filesystem (vxfs), as well as a few other niceties.
That being said, it is flaky as hell, over-priced, has a confusing licensing structure, has limited hardware support, limited ISV support, and is generally harder to work with than any of the free Unices on Intel.
>Also, she had second degree burns on her genitals, since mcdonalds heated the coffee signifigantly above the boiling point of water. That is they heated it hotter than water can get without special equipment. It really pisses me off when people quote this case who no nothing about it.
If it pisses you off so much, why are you doing it?
The coffee was around 180 degrees Fahrenheit, which is well below the boiling point, and only slightly hotter than what my coffee pot (a normal Mr. Coffee consumer model) puts out.
I'm not sure what world you live in, but using the most popular brand of commercial coffee maker to brew coffee that is, at most, 5-10 degrees hotter than the industry average does not translate to using "special equipment" to superheat liquids past the boiling point.
>As for refuting software RAID because hardware is faster... try using a NetApp, and then tell me software is slow. Yep NetApp (the high-high-end of network-attached disk arrays) is software raid.
Oh dear god, this gave me a good laugh. NetApp is "high-high end"???
The first point to be made is that performance issues are going to be masked by the fact that access is through an ethernet network.
You get the latency of ethernet, a free trip through the TCP/IP protocol stack, and the NFS driver as well.
Secondly, they do not make a high end solution. A 1.4TB NAS device qualifies as a midrange product at best.
For high end you are looking at a fibre channel SAN with something like an EMC box (up to 10TB), an HP box (up to 10TB), an XIOtech box (up to 3TB), a Sun StorEdge box (up to 2.6TB) or a Compaq Storageworks box (up to 2.6TB).
>No one ever runs a NetApp off of a wire shared with other services.
Actually, the SMB market does. But this is irrelevant. Repeat after me - network access has too high a latency.
>You can go to gigabit speeds over fiber, though switched 100baseT is usually fine for file access
Repeat after me - network access has too high a latency.
>Databases tend to run faster on a NetApp than local disks
Bullshit. Plain and simple. Maybe this is true in NetApp's marketing department, but not in the real world.
High-end databases tend to run best with raw, asynchronous I/O on top of a raw disk. *NOT* on a network attached filesystem.
>EMC is good, but if you've got, say, a farm of a half-dozen or more file or Web servers that all need to see the same filesystem, NetApp is the cleanest way to go
True, and if this is the market, go ahead and use them. But it's just plain ridiculous to claim that they represent the high high end of storage solutions.
>With just 4 drives, RAID 3 is the least expensive in terms of CPU usage. It's also better if you're most concerned with speed, in that 3 of your drives will be focused solely on read/write operations and the 4th will dedicate itself to redundancy...
>
>Rather than splitting the data and redundancy across drives. You add many more seeks...
You may want to brush up on your RAID technology a bit.
RAID 3 is a stripe set with a dedicated parity drive that uses very small stripes, so every I/O operation accesses all of the disks in parallel.
This is fantastic if you are doing single large I/O operations. However for most applications this is a big mistake, as you are only performing a single I/O operation at a time.
And then there is RAID 4, which is identical to RAID 3, save that it uses stripes larger than typical files.
This has the advantage that we can now perform multiple read operations at the same time, but we still have the single parity drive which prevents the array from performing multiple writes simultaneously.
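The difference between the two stripe sizes can be sketched in a few lines of Python. This assumes a plain round-robin layout across three data disks, and the stripe unit sizes are hypothetical numbers picked for illustration; `disks_touched` is my own helper, not part of any RAID tool.

```python
def disks_touched(offset, length, stripe_unit, data_disks):
    """Return the set of data-disk indexes hit by `length` bytes at `offset`."""
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    return {unit % data_disks for unit in range(first_unit, last_unit + 1)}


DATA_DISKS = 3             # plus one dedicated parity drive
SMALL, LARGE = 512, 65536  # hypothetical stripe unit sizes, in bytes

# Small stripes (RAID 3 style): one 4KB read occupies every data disk.
small_read = disks_touched(0, 4096, SMALL, DATA_DISKS)

# Large stripes (RAID 4 style): two 4KB reads land on different disks,
# so the array can service them at the same time.
read_a = disks_touched(0, 4096, LARGE, DATA_DISKS)
read_b = disks_touched(65536, 4096, LARGE, DATA_DISKS)
print(small_read, read_a, read_b)
```

Writes are a different story in both layouts, since every one of them also has to update the same parity drive.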
This may make sense on a read intensive database. However, if the database is static, and you are looking for high performance, you should look at RAID 0+1, i.e. a mirrored stripe set.
However many shops find the cost of mirroring prohibitive. And you do take a small hit on write operations.
And of course, our old friend RAID 5. This is the meat and potatoes of the RAID world. Reasonably low overhead on both reads and writes.
(As a closing note, I am kind of curious as to where you got the impression that using RAID 3 decreased CPU overhead. The truly expensive operation in RAID is performing the XOR, which is common to all RAID levels, save 0, 1 and 7.
Not that this is important. In the vast majority of scenarios you are going to run into, it is the I/O that is the bottleneck, not the processor.
And furthermore, if you are using software RAID in a high performance environment, you are a fool. You can pick up a dual channel midrange DPT card with 32MB of ECC cache for around $900.)
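For anyone who hasn't seen the XOR in question, here is a minimal sketch in Python, assuming one stripe spread over three data drives plus a parity drive. The block contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]  # one stripe across three data drives
parity = xor_blocks(data)           # what gets written to the parity drive

# The second drive dies: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same calculation applies whether the parity lives on a dedicated drive (RAID 3 and 4) or is rotated across all of them (RAID 5), which is why the CPU cost doesn't differ much between those levels.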
In any case, since this was posted so late and no one is going to read it anyway, I think I'll shut up now. :)
Because there is an important lesson here that a lot of you don't seem to have learned.
To spell it out for you - these people are just that, people. Not some cartoonish corporate supervillains sitting around trying to think of ways to screw the public out of their god-given rights.
Don't get me wrong. I don't believe for one second that things like region encoding and CSS are beneficial to the public. And I am rooting for a successful defense against the MPA and DVD CCA in court.
On the other hand, I find the words and actions of many of the people here to be as distasteful as those of either of those organizations. After all, at least the MPA hasn't yet suggested murder as a way to enforce their opinion on others.
Re:Modern CPU's all have dedicated L2 cache (on Cheap Gigabit Ether · Score: 1)
> By definition: if the kernel can only run as a single-thread, it's Asymmetric Multiprocessing A La the old Macintosh Multiproc machines.
Sorry to burst your bubble, but SMP and AMP are hardware terms, not software.
In an SMP system, each CPU has equal access to all memory locations and can control every I/O device.
On the other hand, an AMP system has certain resources (i.e. memory and I/O) exclusive to particular processors.
It is entirely possible to have a single-threaded kernel running on an SMP machine, essentially binding the kernel to one CPU and running user space applications on the additional CPUs.
In some cases this is done on purpose, to minimize context switches and to exploit locality - i.e. if a process is running exclusively on a processor, it doesn't have to deal with cache contention and minimizes the issues of cache coherence.
Magneto-optical is probably one of the most stable storage mediums available.
It can be rewritten up to 10 million times, has a shelf life of 50+ years, and is only affected by magnetic fields if you heat the surface to 300 degrees.
Another good contender with a higher data density is AME tape technologies, such as AIT, Mammoth and VXA-1. AME tapes are good for around 20,000 passes, with an archival life of 30+ years.
SLR is another good contender for high-capacity data storage, with a shelf life of 20+ years.
For large scale storage, using a form of hierarchical storage management (HSM) would be the best approach, with MO drives (which have a "mere" capacity of 5+ GB) serving out files that are still accessed on a regular basis, and using large capacity tapes on the backend (such as SLR100 or AIT-2, each boasting 100GB compressed).
As data warehousing becomes a more important industry, HSM systems will likely integrate automatic migration away from media that is reaching the end of its archival lifecycle.
To be honest, a lot of what I'm going to say could be applied to multiple comments, but for the sake of convenience, I'll cram them all into this reply. :)
> Git the gun martha, it's another MacHead. Right. They're associated with the application they control. The interface is not modal. I'd really love to see the visual chaos that would result from focus-follows-mouse policies.
True, but there are users who honestly prefer this style of interface.
Me, I prefer my menubars bound to my application, since it keeps them closer. But I can see how some might prefer the modal interface.
> This is one of many reasons I despise the gnome panel and the new KDE one. Not only do they not extend to the bottom, they're now crammed into a grid you have to aim the mouse at.
In Gnome 1.1.1, there is an option under the global preferences to "keep buttons flush with panel".
> The others being that it's too damn big, and applications still maximize to cover it.
In Gnome 1.1.1, the size of each individual panel is configurable to 24, 36, 48, 64 or 80 pixels.
As for the applications maximizing over panels - your window manager is at fault. I use Sawmill without an issue.
> No argument there, though when using a Mac, I often wished for a maximize button of some sort. Minimize was easy enough (Hide) but it was a mouse operation or a chaotic-looking "unhide all" command to get it back.
Placement of buttons on window frames does not belong in arguments against Linux, as this is completely up to the user.
For example, I have no buttons on my window frames at all, since I hate aiming for tiny buttons on the corner of windows. I prefer to min with C+M+B1, max with C+M+B3, destroy with C+M+B3.
Likewise, I hate aiming for skinny little window borders, so I use M+B1 to resize and M+B2 to move, effectively making my "target area" one quarter the size of the window.
Not to say that my desktop is easier or more usable for other people. It is for *me*, the one using it.
And this is what Linux has above and beyond ANY gui out there.
Matthew Berg
(For those who love screenshots, here's my work machine : http://www.pce.net/wdomburg/work.jpg)
No, I was referring to the dock resizing, and for that matter recentering itself, when you add an icon.
Using the finder to keep track of running tasks is a much lighter approach.
And if you're going to have a dock, having it justified to one side or the other is a much lighter approach.
> Are you easily amused? Hard... to... type from laughing at you...
Your credibility would be better if you didn't respond to an honest criticism with derisive comments.
> Ok UI guru, what icon does?
Have you ever heard the saying "You don't have to be able to lay an egg to smell a bad one."?
Yellow, as opposed to red or green, typically has the connotation of "caution". What does this have to do with minimizing?
For the record, I prefer virtually all other options I've seen, including Win3.1, Win95 and CDE defaults.
> Wow, are you one of those guys who also posted everywhere about the iMac not having a floppy drive?
Wow, are you one of those Mac zealots who defends Apple mindlessly as if they couldn't do anything wrong?
Again, for the record, yes, I questioned the logic behind releasing a system that neither came with removable storage nor had anything available for it.
> FYI genius, labels appear when the mouse nears the buttons.
FYI, I am familiar with the concept of tooltips. They are a kludge necessary to overcome badly designed, non-intuitive icons.
There can either be a delay, which means you have to wait, read the tooltip, move the mouse so that it is no longer obscuring what you were trying to click on, and then finally click on the button.
Or there can be no delay, which is even more obnoxious, especially when you've gotten accustomed enough to the interface that you don't need them, again on the issue of them obscuring the screen.
> Ok, we're all waiting. Get back to us with your report.
Though I have not had an opportunity to use it, just looking at the screenshot shows me a couple of problems with the Aqua interface.
* The buttons for min/max/close are not intuitive. I'm sorry, but yellow does not immediately say "minimize" to me.
* The buttons are hard to differentiate for those of us who are color blind. To make it worse, they chose to use red and green, which makes it a problem for the majority of people with color-blindness.
* The close button is now next to the rest of the buttons. Keeping it apart from them was one of the few aspects of the Mac interface I liked. They broke it.
* The dynamically resizing toolbar and alpha channel transparency. I know some people like them, but they are a waste of my CPU cycles. If they can be turned off, this isn't much of an issue.
* The apple menu is stuck higgledy-piggledy in the middle part of the finder. This makes it a much harder target to aim for. Another thing changed from the traditional Mac OS layout that decreases usability.
I'm sure I could find more if I used the product, but that's enough for now.
You ask why someone would do SMP with the power and flexibility of Beowulf? And I can only answer you should do more research before you sing the praises of any technology.
:)
Any cluster technology is only as good as its interconnect, to start with. Even were you to use gigabit ethernet, it would pale in comparison to even the fairly narrow bus that Intel processors run on. Looking at higher end crossbar architectures for SMP and you are looking at a huge difference in raw bandwidth, as well as latency.
Keeping this in mind, you have to start thinking about the cost of performing IPC between nodes or processes on disparate nodes accessing the same memory range. This becomes especially messy when you have to deal with cache coherency.
For example, lets say processor 1 on node 1 (P1N1) decides to read memory addess A on node 2 (N2). Fine, the contents of the address is shipped to P1N1 which stores it in its cache. Next processor 1 on node 3 (P1N3) decides to read that memory addess. N2 notices that the page is still mapped to P1N1, so it has to send a request to N1 to flush the cache lines of the processors so it can update the actual memory address so it can send the value to P1N3.
I would go further in depth but I don't think I am sufficiently eloquent enough at this time of the morning to explain it properly.
If the subject interests you, I would suggest picking up a copy of "In Search of Clusters: The Ongoing Battle in Lowly Parallel Computer" by Gregory Pfister. The book goes into great depths in describing the scalability issues with clusters as compared to SMP in chapter 6.
There is also another post by me in this thread talking more about the applications where Beowulf simply does not make sense.
*sigh* Unfortunately it seems that scalability cannot be brought into a discussions these days without a mention of Beowulf. And even more unfortunately the people who are first to propose beowulf as the end-all be-all solution for all your scalability needs are the people who don't understand basic concepts behind clustering.
This is not to disparage Beowulf in any ways. It is a wonderful piece of software for building large scale distributed processing systems that run custom (typically scientific) applications to one of the various parallel processing libraries such as PVM or MPI.
However, we have to look at the larger picture and consider the needs of more common computing tasks, such as high availability and scalability for business-class applications.
A "commercial" cluster of this type must provide application failover that is trasparent for clients. Beowulf does not do this because it is not what it is designed for.
For that form of scalability, you would be better off looking at a something like Veritas ClusterManager, Linux Virtual Server or TurboCluster. And even then you would have to look at what your application is and decide what approach to high availability is right for you.
As an example, look at the approach that TurboCluster uses. Incoming network connections do not go directly to the machines that service them; instead all incoming connections go to a dispatch server which then sends them to the back-end serves based on availability as well as current load.
Then compare that the the was Unixware Non-Stop Clustering works. With NSC each node in the cluster has its own IP address. In addition to that, there is a "Cluster Virtual IP" (CVIP) address that is used as the alias for the network card in one of the nodes. In the event of a failure of that node, one of the other machines in the cluster will alias its own interface to the CVIP and perform a gratuitous ARP broadcast, thereby overwriting the cache of any machines on the local network that may still contain the MAC address of the pervious owner of the CVIP.
The dispatch method has the advantage that it provides a simple method for providing both high-availability and load-balancing with very little added administration. However you do run into the problem that the dispatch server is then a single point of failure as well as a potential traffic bottleneck.
The CVIP method is more heavy-handed; however it does eliminate the single point of failure and does away with the need for a dedicated machine as a dispatcher.
Scalability is a more difficult issue. With Beowulf it isn't as much of an issue, as applications are written specifically to parallelize the task. However business applications are typically written to run as a single threaded process or as a group of processes communicating via IPC mechanisms. Not suprisingly, this will not scale on Beowulf.
The problem being that to distribute threads or IPC across a cluster, you need to maintain something refered to commonly as Single System Image (SSI). This can be thought of as the logical extension of SMP into the conecpt of clustering. Not only does each processor in a machine have equal access to all resources within that machine, each node has equal access to all resources within the cluster.
In other words, if an applications launches multiple processes, and one of those processes gets migrated to another machine, it has to be ABSOLUTELY transparent to the application. So you need things such as a single process list, a unified namespace for devices, some method for sharing devices (either via multiple paths or through shipped I/O), transparent IPC across the cluster, etc.
This is a very tricky proposition. And the sad part thing is that most applications still need modification to scale properly on SSI cluster, due to the bottlenecks in the interconnect technology. This is especially true when ethernet is being used as an interconnect as it suffers both from bandwidth issues and latency issues.
*looks*
It seems like I've gotten a bit to rambling. My point is that there is no single way to scale that suits all possible needs. Beowulf is one technology, service dispatch is another, SSI clustering is another, etc, etc.
> I had to deal with them a decade ago and I
> found their pricing scheme to be confusing and
> annoying.
It still is. As part of my job I unfortunately
have to do pre-sales configs on SCO servers.
Customer: I need a copy of SCO Unix.
Me: Openserver 5, Unixware 2 or Unixware 7?
Customer: Can you give me parts for all three?
Me: Well, if its Openserver, we would need to know
if they want host edition, enterprise edition,
or desktop edition. We would also need to
know how many users they have, as well as how
many processors the machine has.
Customer: Oh. What about Unixware 2?
Me: Unixware 2 is a little easier. We would need
to know whether they want the application
server or the personal server. And again we
would need a user and a processor count.
Customer: Hrmm... Is Unixware 7 any easier?
Me: No, actually its worse. This one has a lot
more editions - base, business, departmental,
enterprise and data center.
Customer: How do we know which one to get?!?
Me: It depends on how many users, the number of
CPUs, how much memory you have, whether you
need Windows file and print serices, whether
you need a bundled backup software, whether
you need a volume manager...
Customer: Nevermind. I'll call my customer back
and get more information.
And that isn't even dealing with upgrades, which
are even messier.
> And their OS is so bad compared to an advanced
> multi-user OS like linux.
I agree with the sentiment, but I question your
qualification to pass judgement on their product
if you don't realize they don't have one single
operating system.
To be more specific, they are *currently* selling
Openserver 5.0.5, Unixware 2.1.3 and Unixware
7.1.1 (and yes, that is a different operating
system that Unixware 2.1.x).
IMHO, their products really aren't terribly good,
but mine is an informed opinion, having done
support on all three operating systems as well as
their previous products (SCO UNIX, SCO OpenServer
3.0, SCO OpenDesktop 3.0, SCO Xenix) and holding
every certification they currently offer.
And it should be said that at least one of their
operating systems does have some advantages over
Linux. Unixware currently scales better, supports
larger files and filesystems out of the box, has
a proven extent-based journeled filesystem (vxfs),
as well as a few other niceties.
That being said it is flaky as hell, over-priced,
has a confusing licencing structure, has limited
hardware support, limited ISV support, and is
generally harder to work with than any of the
free Unices on Intel.
>Also, she had second degree burns on her
>genitals, since mcdonalds heated the coffee
>signifigantly above the boiling point of water.
>That is they heated it hotter than water can get
>without special equipment. It really pisses me
>off when people quote this case who no nothing
>about it.
If it pisses you off so much, why are you doing
it?
The coffee was around 180 degrees, which is well
below the boiling point, and only slightly higher
than my coffee pot (a normal Mr. Coffee consumer
model).
I'm not sure what world you live in, but using the
most popular brand of commercial coffee maker to
brew coffee that is, at most 5-10 degrees higher
than industry average does not translate to
using "special equipment" to superheat liquids
past the boiling point.
>As for refuting software RAID because hardware is
>faster... try using a NetApp, and then tell me
>software is slow. Yep NetApp (the high-high-end
>of network-attached disk arrays) is software
>raid.
Oh dear god, this gave me a good laugh. NetApp is
"high-high end"???
First point to be made is that performance issues
are going to be masked by the fact that it access
is through an ethernet network.
You get the latency of ethernet, a free trip
through the TCP/IP protocal stack, and the NFS
driver as well.
Secondly, that they do not make a high end
solution. A 1.4TB NAS device qualifies as a
midrange product as best.
For high end you are looking at a fibre channel
SAN with something like and EMC box (up
to 10TB), an HP box (up to 10TB), an XIOtech box
(up to 3TB), a Sun StorEdge box (up to 2.6TB) or
a Compaq Storageworks box (up to 2.6TB).
>No one ever runs a NetApp off of a wire shared
>with other services.
Actually, the SMB market does. But this is
irrelevant. Repeat after me - network access has
too high a latency.
>You can go to gigabit speeds over fiber, though
>switched 100baseT is usually fine for file access
Repeat after me - network access has too high a
latency.
>Databases tend to run faster on a NetApp than
>local disks
Bullshit. Plain and simple. Maybe this is true in
NetApps marketing department, but not in the real
world.
High-end databases tend to run best with raw,
asynchronous I/O on top of a raw disk. *NOT* on
a network attached filesystem.
>EMC is good, but if you've got, say, a farm of a
>half-dozen or more file or Web servers that all
>need to see the same filesystem, NetApp is the
>cleanest way to go
True, and if this is the market go ahead and use
them. But its just plain ridiculous to claim
that they represent the high high end of storage
solutions.
Matthew
>With just 4 drives, RAID 3 is the least expensive
:)
>in terms of CPU usage. It's also better if you're
>most concerned with speed, in that 3 of your
>drives will be focused solely on read/write
>operations and the 4th will dedicate itself to
>redundancy...
>
>Rather than splitting the data and redundancy
>across drives. You add many more seeks...
You may want to brush up on your RAID technology
a bit.
RAID 3 is a stripe set with a dedicated parity
drive. It uses small stripes, so every I/O
operation accesses all disks in parallel.
This is fantastic if you are doing single large
I/O operations. However for most applications
this is a big mistake, as you are only performing
a single I/O operation at a time.
And then there is RAID 4, which is identical to
RAID 3, save that it uses stripes larger than
typical files.
This has the advantage that we can now perform
multiple read operations at the same time, but
we still have the single parity drive which
prevents the array from performing multiple
writes simultaneously.
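To make the stripe size point concrete, here is a
rough sketch of which data disks a single request
ends up touching. The function name, the simple
round-robin layout and the numbers (3 data disks,
an 8KB read, 512 byte vs 64KB stripes) are all
assumptions picked for illustration, not taken
from any particular controller.

```python
def disks_touched(offset, length, stripe_size, data_disks):
    """Return the set of data disks one request touches.

    Assumes a plain round-robin layout: stripe unit k
    lives on data disk k % data_disks. The dedicated
    parity disk is not counted here.
    """
    first_unit = offset // stripe_size
    last_unit = (offset + length - 1) // stripe_size
    return {unit % data_disks
            for unit in range(first_unit, last_unit + 1)}


# Hypothetical array: 3 data disks, one 8KB read at offset 0.
small_stripes = disks_touched(0, 8192, 512, 3)    # RAID 3 style
large_stripes = disks_touched(0, 8192, 65536, 3)  # RAID 4 style

print(sorted(small_stripes))  # [0, 1, 2]
print(sorted(large_stripes))  # [0]
```

With small stripes the one read ties up every
spindle. With large stripes it lands on a single
disk, leaving the others free to service other
reads at the same time.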
This may make sense on a read intensive database.
However, if the database is static, and you are
looking for high performance, you should look at
RAID 0+1, i.e. a mirrored stripe set.
However many shops find the cost of mirroring
prohibitive. And you do take a small hit on write
operations.
And of course, our old friend RAID 5. This is the
meat and potatoes of the RAID world. Reasonably
low overhead on both reads and writes.
(As a closing note, I am kind of curious as to
where you got the impression that using RAID 3
decreased CPU overhead. The truly expensive
operation in RAID is performing the XOR, which
is common to all RAID levels, save 0, 1 and 7.
Not that this is important. In the vast majority
of scenarios you are going to run into it is the
I/O that is the bottleneck, not the processor.
And furthermore, if you are using software RAID
in a high performance environment, you are a
fool. You can pick up a dual channel midrange
DPT card with 32MB of ECC cache for around $900.)
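For anyone who hasn't seen the XOR in question,
this is a minimal sketch of how parity is computed
and how a lost block is rebuilt from the survivors.
The helper name and the tiny 4 byte blocks are made
up for the example. A real array does the same
thing on full stripe units, in the controller or
the driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


# Three hypothetical data blocks, one per data disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died: XOR the
# surviving blocks with the parity to get it back.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same XOR that builds the parity rebuilds the
missing block, which is why the cost shows up at
every parity-based level rather than just one.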
In any case, since this was posted so late and
no one is going to read it anyways, I think I'll
shut up now.
Because there is an important lesson here that a
lot of you don't seem to have learned.
To spell it out for you - these people are just
that, people. Not some cartoonish corporate
supervillains sitting around trying to think of
ways to screw the public out of their god-given
rights.
Don't get me wrong. I don't believe for one second
that things like region encoding and CSS are
beneficial to the public. And I am rooting for a
successful defense against the MPA and DVD CCA in
court.
On the other hand, I find the words and actions of
many of the people here to be as distasteful as
those of either of those organizations. After all,
at least the MPA hasn't yet suggested murder as a
way to enforce their opinion on others.
> By definition: if the kernel can only run as a
> single-thread, it's Asymmetric Multiprocessing A
> La the old Macintosh Multiproc machines.
Sorry to burst your bubble, but SMP and AMP are
hardware terms, not software.
In an SMP system, each CPU has equal access to all
memory locations and can control every I/O device.
On the other hand, an AMP system has certain
resources (i.e. memory and I/O) exclusive to
particular processors.
It is entirely possible to have a single-threaded
kernel running on an SMP machine, essentially
binding the kernel to one CPU and running user
space applications on the additional CPUs.
In some cases this is done on purpose, to minimize
context switches and to exploit locality - i.e. if
a process is running exclusively on a processor,
it doesn't have to deal with cache contention and
minimizes the issues of cache coherence.
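The locality half of this can be tried from user
space. Below is a small sketch that pins the
calling process to one CPU. It relies on
os.sched_setaffinity, which Python only exposes on
some platforms (Linux among them), and the wrapper
name is my own. Note that this only shows binding
a process to a processor. It says nothing about
how any given kernel is threaded.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict a process (0 means the caller) to one CPU.

    Returns the resulting affinity set, or None on
    platforms where the call is not available.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if hasattr(os, "sched_getaffinity"):
    # Pick a CPU we are already allowed to run on.
    allowed = sorted(os.sched_getaffinity(0))
    print(pin_to_cpu(allowed[0]))
```

Once pinned, the process keeps hitting the same
processor cache instead of bouncing between CPUs.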
Magneto-optical is probably one of the most stable
storage media available.
It can be rewritten for up to 10 million cycles,
has a shelf life of 50+ years, and is only affected
by magnetic fields if you heat the surface to 300
degrees.
Another good contender with a higher data density
is AME tape technologies, such as AIT, Mammoth and
VXA-1. AME tapes are good for around 20,000
passes, with an archival life of 30+ years.
SLR is another good contender for high-capacity
data storage, with a shelf life of 20+ years.
For large scale storage, using a form of
hierarchical data management would be the best
approach, with MO drives (which have a "mere"
capacity of 5+ GB) serving out files that are
still accessed on a regular basis, and using large
capacity tapes on the backend (such as SLR100 or
AIT-2, each boasting 100GB compressed).
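As a rough idea of what the tiering decision looks
like, here is a toy policy that picks a tier from
the time a file was last accessed. The function
name, the tier labels and the 30 day window are
all assumptions for the sake of the example. A
real HSM product would have its own, far more
configurable, rules.

```python
from datetime import datetime, timedelta


def choose_tier(last_access, now, hot_window=timedelta(days=30)):
    """Pick a storage tier from the time of last access.

    Files touched within hot_window stay on MO, the
    rest go to tape. The 30 day default is arbitrary.
    """
    return "mo" if now - last_access <= hot_window else "tape"


now = datetime(2000, 6, 1)
print(choose_tier(datetime(2000, 5, 20), now))  # mo
print(choose_tier(datetime(1999, 1, 1), now))   # tape
```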
As data warehousing becomes a more important
industry, HSM systems will likely integrate auto
migration from media that is reaching the end of
its archival lifecycle.
To be honest, a lot of what I'm going to say could
be applied to multiple comments, but for the sake
of convenience, I'll cram them all into this
reply.
> Git the gun martha, it's another MacHead. Right.
> They're associated with the application they
> control. The interface is not modal. I'd really
> love to see the visual chaos that would result
> from focus-follows-mouse policies.
True, but there are users who honestly prefer this
style of interface.
Me, I prefer my menubars bound to my application,
since it keeps them closer. But I can see how
some might prefer the modal interface.
> This is one of many reasons I despise the gnome
> panel and the new KDE one. Not only do they not
> extend to the bottom, they're now crammed into a
> grid you have to aim the mouse at.
In Gnome 1.1.1, there is an option under the
global preferences to "keep buttons flush with
panel".
> The others being that it's too damn big, and
> applications still maximize to cover it.
In Gnome 1.1.1, the size of each individual panel
is configurable to 24, 36, 48, 64 or 80 pixels.
As for the applications maximizing over panels -
your window manager is at fault. I use Sawmill
without an issue.
> No argument there, though when using a Mac, I
> often wished for a maximize button of some sort.
> Minimize was easy enough (Hide) but it was a
> mouse operation or a chaotic-looking "unhide
> all" command to get it back.
Placement of buttons on window frames does not
belong in arguments against Linux, as this is
completely up to the user.
For example, I have no buttons on my window frames
at all, since I hate aiming for tiny buttons on
the corner of windows. I prefer to min with
C+M+B1, max with C+M+B3, destroy with C+M+B3.
Likewise, I hate aiming for skinny little window
borders, so I use M+B1 to resize and M+B2 to move,
effectively making my "target area" one quarter
the size of the window.
Not to say that my desktop is easier or more
usable for other people. It is for *me*, the one
using it.
And this is what Linux has above and beyond ANY
gui out there.
Matthew Berg
(For those who love screenshots, here's my work
machine : http://www.pce.net/wdomburg/work.jpg)
> So...you spend alot of time... resizing...???
No, I was referring to the dock resizing, and for
that matter recentering itself, when you add an
icon.
Using the finder to keep track of running tasks
is a much lighter approach.
And if you're going to have a dock, having it
justified to one side or the other is a much
lighter approach.
> Are you easily amused? Hard... to... type from
> laughing at you...
Your credibility would be better if you didn't
respond to an honest criticism with derisive
comments.
> Ok UI guru, what icon does?
Have you ever heard the saying "You don't have to
be able to lay an egg to smell a bad one."?
Yellow, as opposed to red or green, typically has
the connotation of "caution". What does this have
to do with minimizing?
For the record, I prefer virtually all other
options I've seen, including Win3.1, Win95 and
CDE defaults.
> Wow, are you one of those guys who also posted
> everywhere about the iMac not having a floppy
> drive?
Wow, are you one of those Mac zealots who defends
Apple mindlessly as if they couldn't do anything
wrong?
Again, for the record, yes, I questioned the logic
behind releasing a system that neither came with
removable storage nor had anything available for
it.
> FYI genius, labels appear when the mouse nears
> the buttons.
FYI, I am familiar with the concept of tooltips.
They are a kludge necessary to overcome badly
designed, non-intuitive icons.
There can either be a delay, which means you have
to wait, read the tooltip, move the mouse so that
it is no longer obscuring what you were trying to
click on, and then finally click on the button.
Or there can be no delay, which is even more
obnoxious, especially when you've gotten
accustomed enough to the interface that you don't
need them, again on the issue of them obscuring
the screen.
> Ok, we're all waiting. Get back to us with your
> report.
Thank you for your enthusiasm.
Though I have not had an opportunity to use it,
just looking at the screenshot shows me a couple
of problems with the Aqua interface.
* The buttons for min/max/close are not intuitive.
I'm sorry, but yellow does not immediately say
"minimize" to me.
* The buttons are hard to differentiate for those
of us who are color blind. To make it worse,
they chose to use red and green, which makes
it a problem for the majority of people with
color-blindness.
* The close button is next to the rest of the
buttons. Keeping it apart from them was one of
the few aspects of the Mac interface I liked.
They broke it.
* The dynamically resizing toolbar and alpha
channel transparency. I know some people like
them, but they are a waste of my CPU cycles.
If they can be turned off, this isn't much of
an issue.
* The apple menu is stuck higgledy-piggledy in the
middle part of the finder. This makes it a much
harder target to aim for. Another thing changed
from traditional Mac OS layout that decreases
usability.
I'm sure I could find more if I used the product,
but that's enough for now.
Though I'd prefer to assume this is a troll, I'm too damn cynical to do so. So just a few quick points on your claimed Microsoft innovations:
Pre-emptive multitasking - WRONG. This wasn't even the first desktop OS to sport this feature. AmigaDOS beat them to it back in 1985.
Virtual memory - WRONG. This was supported in 3BSD, back in 1979.
Portability to other processor families - WRONG. This wasn't even new for Microsoft, who released Xenix on Intel, Motorola and Zilog back in 1980.