Pressed send too early (3am here).
My example goes to show that one must not treat the Linux kernel (or any other kernel for that matter) as a single entity that "needs redesign".
The code paths that we used might have been huge-SMP safe, but it's very likely that other parts of the kernel might not be.
When talking about redesign, one should be specific: Are they talking about the memory management? network stack? driver layer? workqueues? scheduler? IRQ handling? etc, etc, etc.
- Gilboa
The linked article doesn't really explain what was simulated and how. (At least not in depth, unless I missed something).
However, I cannot disagree more.
In one of my previous projects we developed kernel-based traffic monitoring software that was originally designed when top-of-the-line servers (such as the HP DL-585G1) had 4 sockets with single-core CPUs (a total of 4 cores).
The same software scaled -linearly- (*) when 4-socket machines with 12-core (AMD Opteron) and 8-core / 16-thread (Xeon EX) CPUs were released. (A total of 48 cores on AMD Opteron machines and 32 cores / 64 threads on Xeon EX machines)
Now, if the Linux kernel (at least the short code paths that we used) had severe problems scaling above, say, 48 cores, a 64-thread Xeon EX would have scaled poorly compared to a 48-core AMD Opteron machine, let alone a 24-thread Xeon EP (dual socket), but our experience seems to suggest otherwise.
- Gilboa
* Actually, we saw better-than-linear scaling, but this can be attributed to the huge L2/L3 caches in modern CPUs.
Why someone voted your post as insightful is beyond me.
Vocal idiots sit on both ends of the GW debate: From the "think about the children" idiots on one end, to "The UN black helicopters will take over the world" idiots on the other.
Your assumption that watching Fox (most likely because the viewers are right-conservative, as opposed to the left-liberal main media) automatically makes people idiots puts you smack in the middle of the "think about the children" camp.
- Gilboa
*Sigh*
Let's assume, for a second, that I have a system that scales linearly from 12 to 48 cores (2S Xeon 56xx/Opteron 2xxx to 4S Opteron 6xxx) and requires ~4GB per core. (48-192GB)
Now, let's assume that I'm willing to switch to ARM.
1. I will need a board with at least 24-96 soon-to-be-released 2.0 GHz ARM Cortex-A9 dual-core CPUs.
2. This board will also need an unbelievably wide (and extremely complicated) crossbar, as I can no longer simply put 4 independent buses (one per CPU) and connect the sockets via a high-speed inter-CPU bus as AMD and Intel currently do.
3. Let alone the fact that this board will also require huge amounts of L3 cache / snoop-cache, as the built-in per-CPU cache will have a horrible cache hit/miss ratio.
4. ... And this, of course, assumes that ARM can:
a. Address more than 64GB RAM. (As I recall, ARM was limited to 36bit)
b. Work in an MP configuration.
5. Let's assume that someone actually managed to solve all of the above; this still doesn't solve one issue: This machine will have abysmal single-thread performance. Even if my application scales nicely to 96 threads (and most applications don't), I will still have code paths that are core-speed dependent, and these code paths will be dog slow on this machine.
In short, ARM currently doesn't come even close to replacing a cheap 2S Xeon/Opteron server, let alone a super-high-end 4S/8S server.
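A quick sanity check of the memory numbers above, as a rough sketch only. The variable names are made up for illustration, and the 36-bit limit is taken from my recollection in 4.a, not from a datasheet:

```shell
# Rough arithmetic only; names are illustrative, not from any real tool.
gib=$((1024 * 1024 * 1024))

gb_per_core=4                              # ~4GB per core, as stated above
min_ram_gb=$((12 * gb_per_core))           # 12-core (2S) configuration
max_ram_gb=$((48 * gb_per_core))           # 48-core (4S) configuration

# Assumption: a 36-bit physical address space, as recalled in 4.a.
addr_limit_gb=$(( (1 << 36) / gib ))

echo "RAM needed: ${min_ram_gb}-${max_ram_gb} GB"
echo "36-bit addressing tops out at ${addr_limit_gb} GB"
if [ "$max_ram_gb" -gt "$addr_limit_gb" ]; then
    echo "The 48-core configuration does not fit under a 36-bit limit"
fi
```

If the 36-bit figure holds, even a third of the 192GB configuration is already at the addressing ceiling.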
- Gilboa
Why do you assume that:
A: PB storage is very rare and only used by several large organizations.
B: PB storage is used to house generated data that can easily be replaced.
- Gilboa
Just for the record, which Fedora version are you talking about?
- Gilboa
.. Argh.
The OP message was hidden, so I missed the reason for your answer.
(OP: RHEL is old and buggy, wish we could use CentOS or Fedora; You: CentOS -is- RHEL...)
My mistake.
- Gilboa
(I use CentOS on development machines, RHEL for production)
1. Releases: Please compare the release date of say, RHEL 4.8 (19/5/09) to CentOS 4.8 (21/8/09).
Or better yet, compare RHEL 5.5 (30/3/10) to CentOS 5.5 (will be ready when it's ready).
Now, CentOS devs tend to follow RedHat security updates fairly closely, and I usually see the CentOS updates ~12-48h after their RHEL parents.
However: A. In a production environment, I'd rather not wait 12-48h. B. Given the complexity of major updates (e.g. RHEL 5.5), CentOS tends to lag RedHat by a considerable margin.
2. Support: We once had a RHEL kernel fix, specifically tailored to our issue, within 24h. CentOS devs simply cannot compete with RedHat. Period.
Make no mistake, I bow before the CentOS devs for maintaining a great distribution, but when my job is on the line, I'd rather use RHEL. Period.
Nobody gets fired for using RHEL.
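To put a number on the 4.8 lag quoted above, here is a small sketch. It assumes GNU date (the -d option) and reads the dates as day/month/year; the variable names are illustrative:

```shell
# Assumes GNU date (-d); -u keeps DST changes out of the subtraction.
rhel_release=$(date -u -d 2009-05-19 +%s)     # RHEL 4.8, 19/5/09
centos_release=$(date -u -d 2009-08-21 +%s)   # CentOS 4.8, 21/8/09

lag_days=$(( (centos_release - rhel_release) / 86400 ))
echo "CentOS 4.8 trailed RHEL 4.8 by ${lag_days} days"
```

That is roughly three months for a minor release, compared to the 12-48h I usually see for security updates.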
- Gilboa
I can't vouch for other manufacturers, but I've got a number of Gigabyte GA-M55S-S3 and GA-M56S-S3 boards that came with single core Athlon64's or low-end Athlon64X2's.
Most of the GA-M55S-S3's have been upgraded to high-end Athlon64 X2's (AM2) or 95w Phenom X4 CPUs (AM2+, Agena core).
The somewhat younger GA-M56S-S3's are far more capable (latest BIOS added AM3 support). I plan to upgrade all of them to Phenom II X4 CPUs (AM2+ or AM3 Deneb), or better yet, if Gigabyte continues to release new BIOS' for these old boards (some of them are close to 3 years old!), I may even upgrade one or two of them to Phenom II X6 CPUs.
- Gilboa
Try changing ISP.
I'm using 013 and I can't say that I notice any slowdown in skype or bit-torrent.
... Most likely they are screwing around with my P2P (amule is ridiculously slow), but that's about it.
- Gilboa
I'd venture a guess - mid 2010.
It should be based around F11/F12.
... But as I said, I'm guessing. (Call it a semi-educated guess)
- Gilboa
"How many times have you been able to do a 'yum update' or 'preupgrade' without having to worry about whether the system will be able to boot correctly?"
Around 99.9% of the time.
"How many times has anaconda crashed mid-install"
Aside from a known bug [1] (a well documented common bug), zero (or at least as far as I remember). And I've been using Fedora since F2 on machines ranging from a PII/366 laptop to 24-core monsters.
"...or failed to detect your RAID and decided instead to wipe individual drives without really telling you, or any number of other nagging problems?"
0.
Granted, I may not be as "successful" as you are - as I only manage ~20 different Fedora machines (as a side job) - but who am I to argue with your well documented arguments... (You forgot about RPM-hell. If you're taking the time to spread FUD, at least do it right!)
- Gilboa
[1] https://bugzilla.redhat.com/show_bug.cgi?id=501057
Any known work around?
I'm an inch from installing IE6 under wine just to view SD....
- Gilboa
... I assume that you understand that your comment (and GPU problems) with a pre-release (...) Fedora 11 has -nothing- to do with the subject at hand (ext4 stability), right?
- Gilboa
I tested the 64bit flash alpha shortly after its release, but had several unexplained crashes.
As such, I decided to wait until the next 64bit flash release.
- Gilboa
s/the lack of certain (stable) i386 plugins/the lack of certain (stable) x86-64 plugins/g
Let me see.
I actually do productive work on my machine, and said productive work actually uses large chunks of the 16GB of RAM I have on my workstation.
Beyond that, the lack of certain (stable) i386 plugins (*cough* flash *cough*) requires me to pull in 150 i386 packages [1] at a total size of 187MB [2].
While not much, given the pace at which Fedora (10) releases updates, ~5% of all my Internet traffic is dedicated to pulling packages that I wouldn't have needed in the first place - if someone had taken the time to take i386 behind the bike-shed and shoot it.
... And don't get me started about the limitations of keeping i386 supported when writing multi-platform software. (Let alone kernel mode modules)
- Gilboa
* The 64bit version of this plugin is still somewhat unstable - at least as far as I could test.
[1] $ rpm -qa --queryformat="%{NAME}-%{ARCH}\n" | grep i386 | wc -l
150
[2] $ declare -i SIZE=0; declare -i CUR=0; for CUR in $(rpm -qa --queryformat="%{SIZE}-%{ARCH}\n" | grep i386) ; do SIZE=$(($CUR + $SIZE)); done; echo Total size: $(($SIZE / (1024*1024))) MB.
Total size: 187 MB.
Less than 10% of the overall ISP traffic in -any- of the major ISPs I worked with was email - POP3/SMTP or Webmail.
Which means that even if 90% of all emails are spam messages, we are still talking about ~8-9% of the total traffic at most.
As others pointed out, most of the ISP traffic (50-70%, depending on the type of the ISP) is P2P.
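The spam figure is just the product of the two percentages. A trivial sketch, using the upper bounds quoted above (variable names are illustrative):

```shell
# Integer percent arithmetic; both inputs are the upper bounds from the post.
email_share=10   # email as percent of total ISP traffic (upper bound)
spam_ratio=90    # percent of email that is spam

spam_share=$(( email_share * spam_ratio / 100 ))
echo "Spam is at most ~${spam_share}% of total traffic"
```

Since email was actually below 10% everywhere I looked, the real number sits somewhat under that.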
- Gilboa
Of course!
The lack of standard APIs (...) is the reason there are no Linux clients for Quake, Doom, Unreal Tournament and Egosoft's X-series! How could I have missed... Oh wait...
- Gilboa
"does anyone seriously believe windows 2003 with sql server 2005 is a bad platform? i'd suggest if you do you've never used it."
Bad? No. Good? Depends on what you use it for....
Our DB department is currently switching from SQL2K8 on Win2K3 to RHEL 5.2/Oracle 10g and the performance is nothing short of staggering.
I'm not a DB person (I'm the resident Linux geek), but at least according to the benchmarks I helped set up, we gained a 3:1 performance increase. (10g on Windows 2K8 was -far- less impressive.)
- Gilboa
In our business environment, we will not upgrade to IE7 because it breaks business applications. No such limitations on FF3 (of course the apps don't work in FF2/3).
Being a Linux user, I've suffered dearly due to my company's insistence on using IE-only web-applications.
Needless to say, I'm now having the time of my life watching our (insert large number of curses in different languages) IT going through hoops to prevent people from switching over to IE7, as it breaks most of these applications completely.
... For me, the day we are forced to switch to Vista will be a second birthday. I've already pre-ordered a huge bucket of pop-corn and a gallon of Coca-Cola. I just can't wait!
- Gilboa
3GB is a 32bit-only limitation. *
AFAIR 64bit processes have a theoretical memory space of ~63bits.
- Gilboa
* Modifiable by changing CONFIG_PAGE_OFFSET and/or __PAGE_OFFSET.
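For reference, the 3GB figure falls straight out of the split point. A minimal sketch, assuming the usual 32-bit x86 default of 0xC0000000 for PAGE_OFFSET (the variable names are illustrative):

```shell
# Assumption: default 32-bit x86 split, PAGE_OFFSET = 0xC0000000.
page_offset=$((0xC0000000))
gib=$((1024 * 1024 * 1024))

user_gib=$(( page_offset / gib ))                 # space below PAGE_OFFSET
kernel_gib=$(( (1 << 32) / gib - user_gib ))      # remainder of the 4GiB space

echo "user space: ${user_gib} GiB, kernel space: ${kernel_gib} GiB"
```

Moving the split changes both numbers, which is why the limit is modifiable rather than fixed.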
True, 4th-gen languages can, at least in theory, take single-threaded code and break it into multiple threads, however:
A. It has nothing to do with the OS itself; at least until someone embeds a .NET/Java VM into the kernel.
B. Using all cores != getting better performance/efficiency. *
Given my current experience with .NET software developers (as someone who feeds them the information from a C/C++ based front-end) - I'm -far- from being impressed by C#. (We are actually scrapping large pieces of existing .NET code and replacing it with C++ [under Linux])
- Gilboa
* You may argue that once we hit 80 cores, efficiency will lose all relevance. But given the fact that our Windows 2K3/.NET people require an $8000 8-core server to accomplish something that could have been executed by a 10-year-old PII366 laptop running a slimmed-down version of CentOS5 (I kid you not) and polling single-threaded C code - I beg to differ.
... I guess I missed the joke.
In my defense, English is not my first language. My bad.
- Gilboa
Without entering the oh-so-boring debate about monolithic vs. micro kernel design, I doubt that Vista's (many) problems have much to do with the NT kernel's basic design. (Though I must admit that I have zero experience in writing Vista kernel modules)
Reducing the size of the kernel (in Win7) - while keeping the in-kernel DRM and Vista's bloated user-land - will yield minimal (if any) performance gains while increasing the complexity of the OS by an order of magnitude. (Even if I agreed with your main theme, that pushing user-land code into the kernel automatically yields an immediate performance boost - and I don't)
... Oh, and given MS's tendency to develop weird semi-documented kernel APIs (did anyone say NDIS?), and their tendency to under-deliver and overshoot schedules (by light-years) - I don't see any reasonable MS component manager risking his chair (pun intended) by pushing code into the kernel. (Especially given MS' apparent lack of interest in performance/optimization) - let alone the C# factor.
- Gilboa