Slashdot Mirror


Five Years After the Sun Merger, Oracle Says It's Fully Committed To SPARC

jfruh (300774) writes "Sun Microsystems vanished into Oracle's maw five years ago this month, and you could be forgiven for thinking that some iconic Sun products, like SPARC chips, had been cast aside in the merger. But Oracle claims that the SPARC roadmap is moving forward more quickly than it did under Sun, and while the number of SPARC systems sold has dropped dramatically (from 66,000 in Q1 '03 to 7,000 in Q1 '14), the systems that are being sold are fully customized and much more profitable for the company."

39 of 190 comments

  1. I'd love to buy some sparc hardware by kthreadd · · Score: 4, Insightful

    If it wasn't for that the price of the hardware can often be close to ten times higher than the equivalent x86 machine.

    1. Re:I'd love to buy some sparc hardware by davecb · · Score: 3, Informative

      It should be: around the time of the acquisition the price-performance ratio finally got back to where it was with the first SPARCs: ten times the price, 100 times the performance.

      A 3U 4-socket T5 machine had about 128 full hardware threads (really: cores) the last time I looked seriously at it. The performance was a bit less than a 32-socket, 4-core-per-socket M9000, the machines I mostly worked with. In those days, I was a capacity planner and performance engineer at Sun Canada.

      A lot, but not everything, is still available open-source from SPARC International.

      --dave

      --
      davecb@spamcop.net
    2. Re:I'd love to buy some sparc hardware by fahrbot-bot · · Score: 2

      If it wasn't for that the price of the hardware can often be close to ten times higher than the equivalent x86 machine.

      At least for certain definitions of "equivalent" ...

      --
      It must have been something you assimilated. . . .
    3. Re:I'd love to buy some sparc hardware by Kjella · · Score: 2

      Sun was trying to sell hardware, I guess we all know how well that went. My guess is you said exactly the same before Oracle bought them. And you don't want to compete with Intel on economies of scale when the vast volume of x86 servers would never migrate to sparc, it'd be like a swimming contest wearing a lead vest. Oracle has quite rightly assumed that if they want to sell sparc hardware they have to create the market by making it the most cost efficient way to run Oracle the database. For anything else they're just measuring you up to consider what it'll cost to migrate all your code away from sparc and charge you just enough that you won't.

      My guess is that if you were one of the 7000 customers considering a custom sparc server, that Oracle server would come with god-tier support and be installed and configured exactly to Oracle's specifications anyway, perhaps even by Oracle experts, and it's probably also not the kind of system where you try to fix it yourself before you call support. For anyone that's played the vendor game, having Oracle control the whole stack from hardware to OS to software means you have one company responsible to fix it and fix it ASAP, and they can just tune it however they want without caring about the general case. Whether it's sustainable over the long run we'll see, but "semicustom" hardware seems to be a growing trend as we can't just throw more cores with more gigahertz at the general case like we used to.

      --
      Live today, because you never know what tomorrow brings
    4. Re:I'd love to buy some sparc hardware by Jane+Q.+Public · · Score: 3, Informative

      A 3U 4-socket T5 machine had about 128 full hardware threads (really: cores) the last time I looked seriously at it. The performance was a bit less than a 32-socket, 4-core-per-socket M9000, the machines I mostly worked with. In those days, I was a capacity planner and performance engineer at Sun Canada.

      Okay, but it's not a one-for-one comparison. A hardware thread in one of those machines lies somewhere between an x86 "core" and a GPU "core" in capability.

      Granted, they were powerful machines. But a 128-thread SPARC machine has nowhere near the capability of a modern x86 machine with 128 cores.

    5. Re:I'd love to buy some sparc hardware by serviscope_minor · · Score: 2

      Just for comparison, that was about a 3U machine.

      You can fit 64 cores and 512GB (1TB if you're rich) into 1U with commodity x86 servers. Last time I checked, power draw is about half the lifetime cost, so the SPARC servers would have to be awfully good on power draw/performance.

      On the plus side, the big T5 servers have larger system images than the cheap commodity x86 ones. So if you want to keep a fuckload of data in RAM, they could be worthwhile.

      --
      SJW n. One who posts facts.
  2. So how many Sparc Systems does Oracle Run? by mykepredko · · Score: 4, Interesting

    While reading TFA, my big question was if the Sparc has been improved so much, is Oracle using it in their systems?

    According to Wikipedia, Oracle has 122k employees; how many of them are running Sparc systems, how many of their internal servers are Sparcs? For a corporation of this size, I would expect, in three months, for them to consume a lot more than the 7k systems that were shipped in the latest quarter.

    When I was at IBM, the company was very proud to be its own best customer; is that true for Oracle?

    myke

    1. Re:So how many Sparc Systems does Oracle Run? by Enry · · Score: 2

      IBM no longer sells desktops and they're getting out of the server market as well, so I think they'll be Lenovo's best customer for the foreseeable future.

      As for Oracle, my guess is that these are Big Beefy Machines(tm) used as replacements for the IBM mainframes (which IBM still owns). They probably do use some in their back-end gear, but don't forget that Oracle also owns Oracle Linux and they have their own line of x86 hardware. That's more likely what they have most of.

    2. Re:So how many Sparc Systems does Oracle Run? by thaylin · · Score: 2

      They are getting out of the X86 server market, not the entire thing. They will still be selling power.

      --
      When you cant win, ad hominem.
    3. Re:So how many Sparc Systems does Oracle Run? by MouseR · · Score: 4, Informative

      Full disclaimer: I'm employed by Oracle.

      My duties at Oracle have always been developing Mac and (more recently) iOS Apps. So I end up using Apple hardware. That aside, all developers have a remote unix account for development. We host server instances of the particular product we develop for, and have other such servers for hosting a number of dev tools. Many of those servers are hosted on "Sunacle" (my term) hardware.

      In the Mtl office, we also have a bunch of Sun stations in various places like some local server rooms and demo/training rooms.

      Sun is *everywhere* around me. Oracle has a huge investment in Sun (both in hardware optimisation and Java) which led to the acquisition 5 years ago. Was more of an investment rescue than a growth acquisition, if you ask me. BUT I DO NOT KNOW for sure if my point of view & assessment is what really led to the acquisition. Developers are not privy to such details.

  3. Unfortunately... by dfn5 · · Score: 3, Insightful

    ... VMWare is only committed to "commodity processors", namely x86, and I believe this is what doomed SPARC. I was a staunch Solaris admin/advocate and still love the hardware. However, Sun's virtualization does not hold a candle to VMWare. vmotion, storage vmotion, DRS and FT completely changed my life as a sysadmin. So at this point Sun hardware is not very useful to me in a datacenter. It is too bad because it was great.

    --
    -- Thou hast strayed far from the path of the Avatar.
    1. Re:Unfortunately... by pnutjam · · Score: 3, Informative

      The killer feature is portability and snapshot/template. You can clone machines quickly and move them to new hardware seamlessly (with the right licenses and backend). You can also make quick snapshot backups to back out changes or clone to new systems. With the right scripts you can scale up your cluster sizes dynamically.

    2. Re:Unfortunately... by Chris+Mattern · · Score: 4, Interesting

      Why would you use virtualization in such an environment?

      I can sum it up in one phrase: No. Hardware. Downtime. Ever.

      VMWare's solution enables you to move production servers at will without ever halting execution. Any hardware upgrade/replacement will have zero downtime. Even a hardware failure can be automatically migrated away from before it takes down the server and fixed without any down time.

    3. Re:Unfortunately... by HornWumpus · · Score: 2

      Except all the ones that do. You should get out more.

      --
      John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
    4. Re:Unfortunately... by afidel · · Score: 2

      That's nice, and for those of us who want to spend less than $1M per year per box there's VMWare which provides nearly the same uptime at a fraction of the cost, and we don't have to put up with IBM. Commoditization is a good thing for the customer.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  4. My suggestion to Oracle: SPARC everywhere... by mlts · · Score: 3, Interesting

    My suggestion to Oracle: Get SPARC's marketshare up. This might take some doing, but long term, expanding the ecosystem is a good way to keep revenue coming in, where customers buy new machines to upgrade, as opposed to "upgrading" to commodity x86 hardware.

    This would require some work on the whole stack from the CPU on up to applications. For example, getting Solaris LDOMs and domains to work with SCVMM or the enterprise admin tool of choice. Another would be getting Linux applications to work on Solaris with low to minimal porting necessary. IBM did this with AIX starting at 5L (where it took a code recompile, but little else.)

    As I mentioned before, Oracle has some pretty nice technologies which can shake up the market. SPARC servers have Infiniband, so if Oracle does some work with the hypervisor to allow one machine to access another box's disks via Infiniband, add redundancy (on both drives and nodes), this would completely get rid of a need for a SAN backend. Need more storage? Just add more drives to one of the machines, or add another node to the cluster, similar to how Isilons are updated. ZFS is also a crown jewel, and can be used for a lot of things as well, especially backend deduplication.

    I hope Oracle can reinvent itself. They have a lot of core technologies that they could use to eke out a definite niche in the enterprise. Combine that with the fact that SPARC and Solaris are mature technologies, and Oracle can bring to the table pretty decent security.

  5. "each system is more profitable for Oracle" by DCFC · · Score: 2, Insightful

    I'm glad that each system now makes more money for Oracle, I knew there was a reason for buying Sun/Oracle gear, it makes them richer.

    Just for a moment I thought there might be a reason *for me*.

    --
    Dominic Connor,Quant Headhunter
    1. Re:"each system is more profitable for Oracle" by Anonymous Coward · · Score: 2, Interesting

      I work for a company that was one of Sun's top 5 customers [ergo, "Anonymous"]. When Oracle took over, we were greeted with the elimination of most of our bulk discounts and an admonition to no longer deal with VARs. Since then, we've begun an aggressive Linux implementation program. The purchase of new Sun-branded hardware, in my rather small working group alone, has gone from several hundred servers a year to zero. We have pallets of decommissioned T-series machines in storage awaiting a trip to the scrap yard. It's too bad, Sun was a great company. So was HP, years ago.

  6. Re:My suggestion to Oracle: SPARC everywhere... by Anonymous Coward · · Score: 2, Insightful

    That's not how Oracle makes money. They buy popular but less profitable companies, and then jack up the prices on their product until everyone finally migrates to other systems. Once they've driven away all the customers of the acquired company, they buy another popular but unprofitable company and repeat.

  7. Uh, I've worked for Big Blue . . . repeatedly. by mmell · · Score: 2

    They never had me use a POWER workstation. Always Intel hardware . . . although they did finally manage to lose their addiction to M$-Windoze. Employees now are issued laptops with a rebranded version of RHEL installed.

    I would expect Oracle to follow a similar pathway, sticking with Intel hardware for its employees. I would not expect them to ditch M$-Windoze; unlike IBM, Oracle doesn't have a long acrimonious love-hate history going with M$.

    1. Re:Uh, I've worked for Big Blue . . . repeatedly. by dryeo · · Score: 2

      IBM was (is?) split into different fiefdoms, and the PC division hated OS/2 and loved Windows. At the end they had enough pull to force the cancellation of OS/2-PPC when MS was going to refuse any special pricing if IBM continued with OS/2 development. Gartner didn't particularly like OS/2 either.

      --
      https://en.wikipedia.org/wiki/Inverted_totalitarianism
  8. Re:Why SPARC? by __aaclcg7560 · · Score: 2

    A brand new "SPARC Inside" sticker.

  9. Not so. by emil · · Score: 4, Informative

    If you examine the top two best performing database platforms (as benchmarked by TPC-C score) you will discover that they are both sold by Oracle, and that the SPARC version has both higher performance and a lower cost per transaction than the x86-64 version.

    You might find this quote to be particularly interesting:

    "I am going to make a promise to you," [Larry] Ellison said. "By this time next year, that Sparc microprocessor will run the Oracle database faster than anything on the planet."

    1. Re:Not so. by itzly · · Score: 2

      From the table, it looks like the Oracle is the fastest, but also the highest price/tpmC, while the Dell is the cheapest.

    2. Re:Not so. by AuMatar · · Score: 2

      The question is- is that because sparc is better, because Oracle optimized for Sparc like mad, or because they purposely degraded the performance on x86?

      --
      I still have more fans than freaks. WTF is wrong with you people?
    3. Re:Not so. by emil · · Score: 3, Insightful

      If I want Oracle PL/SQL in Postgres, I have to purchase EnterpriseDB. If you can get EnterpriseDB to give away the "deep Oracle compatibility" for free, many Oracle installations might switch. Let me know how that works out for you.

      I'd also like to see PostgreSQL in the TPC-C top ten. That's a lot of work, and for people who need scalability, they don't have time to wait.

    4. Re:Not so. by afidel · · Score: 2

      How about look at TPC-H, where in the 10TB test the T5-4 gets beat in performance by the DL580 G8 and the x64 system is half the cost per transaction? The number of shops that need more than 120 threads and 3TB of ram is vanishingly small.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  10. Re:My suggestion to Oracle: SPARC everywhere... by bears · · Score: 3, Informative

    AIX 5L+? Minimal porting? You've very obviously never actually done it.

    The total extent of IBM's efforts with AIX 5L was to put RPM 3.0.3 on their systems and build a few RPMs. The underlying source base for your RPM better support AIX or you're in for a good deal of fun. And you know what? Pretty much everybody dropped AIX support years ago for, I might add, very good reasons. AIX is a Unix, but a seriously weird one. Oh, and by the way, can you guess the version of RPM shipped with the latest AIX? Clue: it begins with 3. Check out the versions of packages at http://www-03.ibm.com/systems/power/software/aix/linux/toolbox/alpha.html. Most/all are seriously old. Many are a decade or more out of date.

    As someone who has to deal with targeting AIX (as well as Linux), from my developer PoV AIX is dead, dead, dead. And starting to smell very very badly.

    Meanwhile, Oracle have something like, what, 28k system sales per annum on which to amortize the cost of SPARC development? Pity. I loved old Sun kit, but sorry, SPARC is walking dead too. Just like AIX and POWER.

  11. Re:Why SPARC? by unixisc · · Score: 3, Informative

    From a general standpoint, very little - which is why their numbers have dropped from 66k to 7k in 11 years. They are apparently used to build systems still compatible w/ legacy Solaris systems, which is what enables their high margins. Otherwise, more than 64-bit, Intel's multi-core architecture, and the fact that it is several process nodes ahead of the SPARC, gives it a big advantage at the same price point, not to mention its support for several more modern OSs, such as Linux, BSDs, Windows Server 2012, and more. If you don't have a legacy SPARC establishment, there's no reason to go that route.

    As Intel found out w/ Itanium, the traditional disadvantages of CISC - wrt not only VLIW, but RISC as well - are obliterated: Merced originally resulted in only a 10% savings in die size - certainly not worth the complexity in the compilers and other costs incurred in building that platform. And once Intel tossed more cores into a CPU to scale up its performance, overtaking any other RISC CPU at the same price was no longer an issue. Especially since every OS for it - Windows, Linux, *BSD - supports it.

  12. Re:lot of talk by fe105 · · Score: 2

    Looking at spec.org jEnterprise2010 scores:

    http://spec.org/jEnterprise201...
    http://spec.org/jEnterprise201...

    A T5-2 gets a jEnterprise2010 score of 17k, an X4-2 11k (with half the memory and Oracle Broken Linux 5.9, why not 6.*?).

    The sparc has a list price of ~68k USD. Not sure what a two socket Oracle intel box costs; maybe 15k or so?

    sparc: 4 usd/score
    intel: 1.36 usd/score
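
    The two ratios above are simply list price divided by score. A minimal sketch of that arithmetic, using only the figures quoted in this comment (the 15k Intel price is the poster's own guess, and the dictionary and function names are illustrative, not from any benchmark tool):

```python
# Price per benchmark point, using the figures quoted in the comment.
# The Intel list price is the commenter's guess ("maybe 15k or so").
systems = {
    "sparc T5-2": {"list_price_usd": 68_000, "score": 17_000},
    "intel X4-2": {"list_price_usd": 15_000, "score": 11_000},
}


def usd_per_score(list_price_usd: float, score: float) -> float:
    """Return list price divided by benchmark score."""
    return list_price_usd / score


ratios = {name: usd_per_score(**figures) for name, figures in systems.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} usd/score")
```

    Swapping in a real quote for the Intel box changes only the one guessed number; the comparison is only as good as that guess.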

    Sparc was nice once, but that was ever so long ago..

  13. Re:My suggestion to Oracle: SPARC everywhere... by HornWumpus · · Score: 2

    That's not the only way Oracle makes money.

    They also get companies to sign unreasonable contracts, then six months later 'hire away' the deal maker for 5-10x previous salary for a zero responsibility marketing job that lasts a few years.

    If you ever see that pattern on a resume _run away_. Not only is the person crooked, they can't manage money. The job should have left them set for life. Some are so greedy they try to leverage the salary history.

    --
    John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
  14. Re:Less than 110% by HornWumpus · · Score: 2

    110% = 25% mon-thr + 10% fri

    --
    John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
  15. What do Oracle Customers Say? by Anonymous Coward · · Score: 4, Interesting

    I worked at a Sun Microsystems shop. We bought thousands of their servers yearly and these weren't just cheap systems, but the big E-class stuff for $500K-$3M each. The people were good to work with, the hardware lasted just a little longer than we wanted, and Sun was a nice company for the F/LOSS world.

    Then IBM offered a better golf deal to the CxO at that place and we were directed overnight to buy IBM whenever possible. The P-class stuff was cheaper than Sun's and AIX wasn't hard to use - we ran Sun, IBM, HP, and a few other systems - not a big deal.

    After a year, Sun came back with new architectures that added many more cores for next to nothing extra power. We went through a huge modernization effort to free up physical space in all our data centers and deployed virtualized servers as a default. It was fairly routine to swap 1 physical box for 10-20 older boxes. Nice.

    Then Oracle bought Sun and started the marketing takeover. Engineers know what I'm saying (VMware/EMC are similar). Then Oracle started behaving badly in the F/LOSS world, killed a few projects and started to stink up a few other projects.

    Never pay Oracle for anything except a DBMS - Oracle. Don't get consulting, and run, run, run away from their enterprise software stuff. Anyone who has been through 2-3 yrs of attempted deployments for these white elephants knows why. You will be sold the impossible and it will never be completed. At $300/hr per consultant, they will bleed your budget until you can find a scapegoat to fire, thus saving your own career.

    For us, the writing was clear - only buy Oracle HW when absolutely necessary and reduce our dependence on their DBMS to about 10% of our DBs. Go with Linux and x86 hardware whenever possible and use postgres for the DB unless really needed so there was real competition.

    What do other customers say?

  16. Re:My suggestion to Oracle: SPARC everywhere... by mlts · · Score: 2

    I do agree that AIX does stand for "Alien Interpretation of UNIX", but even though it is squirrely, if an application runs on it, it runs well.

    I am not disagreeing with the fact that AIX and Solaris are bit players. However, I would say that one problem is that both Oracle and IBM at best are focused on retaining existing customers. Neither has any marketing focus on getting people from VMWare and OpenStack onto their platforms. And without expanding the market, just as the parent stated, if the market isn't growing, it is shrinking.

    This is a hard thing to do. The trend has been for businesses to have projects to get off of SPARC and POWER onto commodity x86 hardware, because x86 hardware has a price advantage, and can be sourced from a number of vendors. Both IBM and Oracle will have to have a good reason (good as in financially appealing), but it could be done.

    There is the security aspect. Solaris and AIX have long since gone through their teething problems when it comes to security and are quite robust in this regard. Solaris has tossed root (as a user) in Solaris 11, and uses roles (this functionality can be reversed if needed), and AIX can run completely root-less, as well as use signed executables/libraries/scripts. If Oracle could put some R&D into security... and a reasonable way to manage/audit things, they might just gain some ground back.

    However, it would have to be a -major- improvement in security features, beyond the delta from Solaris 1.x to 2.x, something as major as the jump from Windows 3.1 to NT. Plus, it isn't just features, it is ease of implementation. Something where Solaris can be marketed as, "if it runs on this OS, it is secure".

    What might have to happen is that Oracle might have to license things from Microsoft. Exchange and Active Directory come to mind. This way, even if there is a major Windows exploit, core AD servers would still be protected because they would be running on Solaris. It is doubtful MS would license this, and it would take some coding by Oracle... but it is going to take a Herculean effort to get SPARC's marketshare to grow again anyway, so might as well try to get businesses to move to the platform by offering an alternative to a Windows backend.

  17. Consistent with a dying platform by quantaman · · Score: 4, Interesting

    A more telling stat was that in Q1 2003, Sun shipped 66,000 Sparc units, most of them Sun Fire servers, the commodity line. In Q3 of 2014, that number was down to no more than 7,000 units in the quarter. But he notes that while Oracle's unit sales are down, the devices it sells are very high-end and are fully configured and integrated with compute, storage, networking and software completely integrated.

    That isn't a refutation of the claim that Sparc is dying, it's just an explanation of how it happens.

    Sparc users are the same as any other group, the exodus starts with the fringe and then moves to the core. Casual low-profit customers found it easy to switch platforms so left a long time ago. The big high profit customers have high loyalty and massive sunk costs, it's hard for them to switch platforms so they'll be the last to go. If Sparc is dying then that's exactly the pattern I'd expect it to follow.

    --
    I stole this Sig
  18. Reputation by Livius · · Score: 2

    Oracle says a lot of things.

    They inflicted Fusion on my employer, and not a single claim about it has turned out to be true.

  19. Actual Solaris Sysadmin Here - Here's the story by AtariDatacenter · · Score: 4, Informative

    Solaris/SPARC is still going strong in large companies. One of the greatest advantages it has is that Oracle creates and supports the operating system, and Oracle creates and supports the hardware. (If you're running an Oracle database or some other piece of software, then that's an additional component that they create and support.) What this means is that if I'm having a problem, mundane or esoteric, I can go to one vendor and say, "Fix it." There isn't any bickering about what company's problem it is, and who manufactured my RAM, or any of the other silliness that crops up in vendor support. Large companies value this (as do us sysadmins). That also means they can do some very cool software tricks (which I'll mention a few here below).

    The decreasing unit shipments is just as much a sign of virtualization as anything. Right now, I'm looking at an older T5240, with two eight-core CPUs which presents itself as having 128 virtual CPUs (execution engines or thread engines), and 64gb of RAM. This is by no means the biggest box on the floor. We carve these up into smaller systems using either Solaris Zones, or LDOMs. That's two different methods of virtualization with two different goals.

    I did something great with an LDOM last week. I took a virtual server that was on the box and migrated the entire operating system and all the applications over to another LDOM... WHILE IT WAS STILL RUNNING. Aside from a quick (1 second) pause, the applications on the server had no idea that it just migrated to another piece of hardware while it continued to run. Slick! The original server had a failing DIMM. No worries, though even aside from ECC, the operating system automatically mapped out which parts of the DIMM were defective and retired the pages of memory so that they weren't constantly being exercised. Linux does all that... right? No?

    Someone else, above, said, "I don't think you can have a zfs system fail and move it to different hardware like you can with vmware...". Never mind that we can migrate a running operating system and application to another piece of hardware and keep it running. Yes, of course if you have a hardware fault, you can bring it back up on another machine. The virtualization with Solaris is quite capable.

    In the environment of a large company where we're competing against Linux on the low-cost end of things, Solaris/SPARC is not only holding its own, but actually beating our Linux cloud counterparts in the costs of a virtualized OS/hardware. (I should ask my boss if we can publish a paper on this, because it is rather impressive.)

    On the high end of things, we completely dominate. We generally use a T5-4 for our internal cloud (which really isn't the biggest Oracle server out there). It has 64 cores, presenting 512 execution threads to the scheduler. RAM goes up to 2TB. If someone starts out on a tiny box with only one CPU and 4gb of RAM, we can scale them all the way up to the top by increasing their virtualization settings. No migrating to different or unusual hardware. If an application team can't scale their code horizontally (hey, it happens), they can go way vertical in this configuration. We haven't had a need yet for an M6-32 (32tb of RAM, and 32 of the 12-core CPUs, for 3072 execution threads or "virtual cpus"). We have Linux surrounded (on the low-cost side and the high-performance side) in a large enterprise environment, and that's why Solaris is still there.
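
    The "virtual CPU" counts in this comment all follow from sockets times cores per socket times threads per core. A rough sketch of that arithmetic, assuming 8 hardware threads per core (inferred from "64 cores, presenting 512 execution threads", not quoted from a spec sheet) and 16 cores per socket on the T5-4 (inferred from 64 cores across 4 sockets); the names are illustrative:

```python
# Virtual CPUs = sockets * cores per socket * hardware threads per core.
# THREADS_PER_CORE = 8 is inferred from the figures in the comment
# (64 cores -> 512 threads); treat it as an assumption.
THREADS_PER_CORE = 8


def hardware_threads(sockets: int, cores_per_socket: int,
                     threads_per_core: int = THREADS_PER_CORE) -> int:
    """Number of execution threads the scheduler would see."""
    return sockets * cores_per_socket * threads_per_core


# (sockets, cores per socket) as described in the comment; the T5-4's
# 16 cores per socket is derived from its stated 64 cores in 4 sockets.
machines = {
    "T5240": (2, 8),
    "T5-4": (4, 16),
    "M6-32": (32, 12),
}

thread_counts = {name: hardware_threads(*cfg) for name, cfg in machines.items()}

for name, count in thread_counts.items():
    print(f"{name}: {count} virtual CPUs")
```

    Under those assumptions the results match the 128, 512 and 3072 figures given in the comment.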

    Now, I'm not an Oracle salesperson. But if Slashdot ever did an AMA with an Oracle sales engineer, I think my fellow Linux admins would be particularly impressed on how far ahead Oracle/SPARC is in a number of key areas.

    1. Re:Actual Solaris Sysadmin Here - Here's the story by AtariDatacenter · · Score: 2

      > Wow. How impressive. Oh wait, Linux has had EDAC since 2006. But you keep paying your millions to Oracle. I'm sure its worth it.

      Actually, this might be worth an illustration. It was a long time back, so I'm sure I've forgotten a few details, but I'll give you the big picture.

      Around 2000, Sun Microsystems had a problem with the L2 cache on their 400mhz CPUs. It seems that IBM misrepresented the error rate on the chips, and they were having bit errors that were much higher than specified. Because of what was supposed to be an incredibly low error rate, they engineered the L2 cache with parity protection. That's enough to detect an error and cause a UE (uncorrectable error) event. So I know that your EDAC functionality in 2006 was in Solaris well before 2000.

      After that problem, Sun Microsystems did two things. First, they mirrored the L2 cache. Second, they completely beefed up their handler for CE/UE (correctable errors and uncorrectable errors) along the memory/cache/bus/cpu to bring it up to Enterprise level error handling. You get an Uncorrectable Error in your CPU's L2 cache. Do you panic? I looked over the EDAC documentation and I could be wrong (please correct me, if so) but it looks like that would result in a panic. Or you could just have it log that the UE event happened but take no action.

      What would Solaris do differently? It would find the page of virtual memory that had the corresponding error. Has it been modified? If not, just discard the page, log the event, and go on. There is a whole set of rules it goes through to determine the best way to keep the system running when it hits an uncorrectable error. Let's say that the page was modified and that there was an uncorrectable error in the L2 cache. We panic now, right? No. Solaris checks and sees who the page of memory belongs to. If it is a user process, then that process is simply killed (and the event logged) and the OS continues running. Only if it is a dirty page of active kernel memory do we have a panic.
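
      The rules in the paragraph above can be read as a small decision procedure. This is only a simplified model of the behaviour as described in this comment, not Solaris's actual code; the `Page` fields, the `Action` names and the function itself are hypothetical:

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DISCARD_PAGE = "discard the page, log the event, continue"
    KILL_PROCESS = "kill the owning process, log the event, continue"
    PANIC = "panic"


@dataclass
class Page:
    modified: bool      # has the page been changed since it was loaded?
    kernel_owned: bool  # does it belong to active kernel memory?


def handle_uncorrectable_error(page: Page) -> Action:
    """Pick the least disruptive response to an uncorrectable error.

    Mirrors the order of checks in the comment: clean pages are
    discarded, dirty user pages cost only the owning process, and
    only a dirty kernel page forces a panic.
    """
    if not page.modified:
        return Action.DISCARD_PAGE
    if not page.kernel_owned:
        return Action.KILL_PROCESS
    return Action.PANIC


if __name__ == "__main__":
    for page in (Page(False, False), Page(True, False), Page(True, True)):
        print(page, "->", handle_uncorrectable_error(page).value)
```

      The real handler reportedly applies "a whole set of rules"; the sketch keeps only the three outcomes the comment spells out.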

      That isn't just recovering from a soft error. That's recovering from a hard error. So, as this story illustrated, there are quite a number of things happening behind the scenes in an enterprise level OS. You picked a good example with Linux EDAC.

    2. Re:Actual Solaris Sysadmin Here - Here's the story by Ereth · · Score: 4, Interesting

      I remember the first time I had a real hardware error on a Dell system running Linux. Straightforward enough, called out DIMM1. So I called Dell. They said "Oh, that doesn't necessarily mean a memory error. The way the PCI bus works that error could be on the bus itself, in the memory, or in the card in the first PCI slot. There's no way to tell".

      Seriously? No way to tell what "Error in DIMM 1" means? That's what the guy insisted. His solution? Turn the computer off and reboot. If it crashes again, call him back.

      This was on a Production database. No way was I going to just power off/on and wait for a follow up crash. I was used to sending Sun explorers and getting exact part numbers back for failures. If Dell couldn't do that, why were we playing this game?

      Dell finally agreed to send a technician with all three parts so he could diagnose and we could solve it with one downtime instead of several. But as a long time Solaris guy, I was totally disgusted.

      Sure, for edge servers, startups, small things, you can get away with that. But for business critical in Enterprise? I want better support from my vendor than "reboot and let us know if it crashes again".