Slashdot Mirror


LibreOffice Calc Set To Get GPU Powered Boost From AMD

darthcamaro writes "We all know that the open source LibreOffice Calc has been slow, forever and a day. That's soon going to change thanks to a major investment made by AMD into the Document Foundation. AMD is helping LibreOffice developers to re-factor Calc to be more performant and to be able to leverage the full power of GPUs and APUs. From the article: '"The reality has been that Calc has not been the fastest spreadsheet in the world," Suse Engineer Michael Meeks admitted. "Quite a large chunk of this refactoring is long overdue, so it's great to have the resources to do the work so that Calc will be a compelling spreadsheet in its own right."'" Math operations will be accelerated using OpenCL, unit tests are being added for the first time, and the supposedly awful object-oriented code is being rewritten with a "modern performance oriented approach."

20 of 211 comments

  1. If you need it you are doing it wrong. by 140Mandak262Jamuna · · Score: 5, Insightful

    If your spreadsheet needs a GPU to speed up calculations, you are probably misusing spreadsheets. I know most accountants love the spreadsheet, and they make insanely complicated things with spreadsheets, pushing them far beyond what they were designed to do. But if you have a spreadsheet that needs this much CPU time to recompute, you should probably be using a full-fledged database with multiple precomputed indexes.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    1. Re:If you need it you are doing it wrong. by Russ1642 · · Score: 5, Insightful

      Custom database applications are expensive and inflexible. Stop trying to tell people what they can't do with a spreadsheet.

    2. Re:If you need it you are doing it wrong. by buchner.johannes · · Score: 3, Interesting

      I agree. Also, if you rewrite structured code into a "performance oriented approach", you are doing it wrong.
      Write code so it is easy to understand. Then compilers should understand how to make it fast.
      This can only come from people who think code is for machines. Code is for humans to read and modify.

      --
      NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
    3. Re:If you need it you are doing it wrong. by Anonymous Coward · · Score: 5, Funny

      Spreadsheets are all rectangular. That's pretty inflexible. Show me a triangular spreadsheet and then we'll talk.

    4. Re:If you need it you are doing it wrong. by Kenja · · Score: 5, Informative

      Pivot Tables can have three or more axes.

      --

      "Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
    5. Re:If you need it you are doing it wrong. by BitZtream · · Score: 4, Interesting

      That's not the issue. If your spreadsheet is SO large that on a MODERN CPU it's slow ... you're doing it wrong.

      You can make insanely complex, application-like spreadsheets without noticing 'recalc' time. By the time you get to noticing 'recalc' time, you've fucked up.

      Caveat: OO.org is known to have some of the crappiest code in existence, so in the case of Calc, you don't have to make ridiculous spreadsheets to notice recalc time. GPU support won't fix the problem, however, as it's not the math that's the issue, it's the shitty logic code filled with stupid crap written by clueless devs that causes Calc to be so slow.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    6. Re:If you need it you are doing it wrong. by gstoddart · · Score: 3, Insightful

      you should probably be using a full-fledged database with multiple precomputed indexes

      Well, you can put together a spreadsheet in a few hours.

      What you're describing is likely months of custom development and design, and a whole new thing to maintain.

      Spreadsheets are popular because they're easily deployed, don't require any extra licensing, and the people who know how to use them can likely do things with them that some of us would be astounded at.

      I know people who use spreadsheets for pretty much everything, because it's available to them readily, and they've been using them for a long time.

      It's all well and good to suggest that they use a full-fledged database -- but in reality, they can probably get something useful in a few days for a fraction of the cost.

      It sounds like in this instance, the code was just horribly inefficient.

      --
      Lost at C:>. Found at C.
    7. Re:If you need it you are doing it wrong. by robthebloke · · Score: 4, Informative

      I agree. Also, if you rewrite structured code into a "performance oriented approach", you are doing it wrong.

      Nonsense. One of the joys of C++ is the lack of reflection. This tends to lead apps down the route of wrapping everything into an 'Attribute' class of some description, and wiring those attributes together using a dependency graph. The problem with this (very clean OOP) approach is that it simply doesn't scale. Before too long, this constant plucking of individual data values from your graph ends up becoming a really grim bottleneck. If you then run the code through a profiler, rather than seeing any noticeable spikes, you end up looking at an app that's warm all over. If you're in this situation, no amount of refactoring is going to save the product. Your only option is to restructure the

      The "performance oriented approach" is the only approach you can take these days. Instead of having a fine OOP granularity on all of your data, you batch data into arrays, and then dispatch the computation on the SIMD units of the CPU, or on the GPU.

      Then compilers should understand how to make it fast.

      Uhm, nope. Sure, if you happen to have 4 additions right next to each other, the compiler might replace that with an ADDPS. In the case in point however, you'll probably expect a generic node to perform the addition on single items in the table. As such, your "addTwoTableElementsTogether" node isn't going to have 4 floating point ops next to each other, it will only have one. Compilers cannot optimise your data structures. If you want to stand a chance of having the compiler do most of the gruntwork for you, you actually have to spend time re-factoring your data structures to better align them with the SIMD/AVX data types. Some people call this a "performance oriented approach".
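
      A minimal sketch of that contrast, written in Python for brevity rather than C++. The names `Cell`, `add_node`, and `add_columns` are hypothetical and are not taken from Calc. The sketch only illustrates the data layout: one object per cell fed through a per-element node, versus contiguous columns processed in a single batch. In C++, the batched loop is the form a compiler has a realistic chance of vectorising.

```python
from array import array


class Cell:
    """Hypothetical per-cell object: one heap allocation per value."""

    def __init__(self, value):
        self.value = value


def add_node(a, b):
    """Generic node that adds exactly two cells, one pair per call."""
    return Cell(a.value + b.value)


def add_columns(xs, ys):
    """Batched form: whole columns stored as contiguous arrays of doubles."""
    return array("d", (x + y for x, y in zip(xs, ys)))


# Object-per-cell layout: the addition is dispatched once per cell pair.
col_a = [Cell(float(i)) for i in range(4)]
col_b = [Cell(10.0) for _ in range(4)]
per_cell = [add_node(a, b).value for a, b in zip(col_a, col_b)]

# Columnar layout: the same values, stored as two flat arrays.
flat_a = array("d", (float(i) for i in range(4)))
flat_b = array("d", [10.0] * 4)
batched = add_columns(flat_a, flat_b)
```

      Both paths produce the same numbers; what differs is that the columnar version keeps the operands next to each other in memory, which is the precondition for handing the work to SIMD units or a GPU.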

      This can only come from people who think code is for machines. Code is for humans to read and modify.

      Bullshit. This can only come from experienced software developers who understand that the only approach to improving the performance of a large-scale app is to restructure the data layout to better align it with modern CPUs. There is *NOTHING* about this approach that makes the code harder to read or follow - that's just your lack of software engineering experience clouding your judgement.

    8. Re:If you need it you are doing it wrong. by jellomizer · · Score: 3, Insightful

      Spreadsheets are good for "throwaway applications": you need to do some calculations fast or gather data, and after a few weeks you don't need it anymore.
      If you are going to be following a process with fairly rigid data sets, you are going to be better off spending the time and money to make a real application with a real database behind it. That way the rigidness works in your favor: it prevents incompatible creep and allows for future data gathering abilities.

      Using spreadsheets for your application needs works, but it is very flimsy, and over the long run you will spend a lot more time fixing your mistakes. A bad sort, a mistimed change and save, or just the wrong click of your mouse, and you have messed up a lot of data.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    9. Re:If you need it you are doing it wrong. by gstoddart · · Score: 4, Insightful

      You're a developer. Good for you. Good for me too. But our jobs are not to make patronising unrealistic suggestions to smart people who don't have our particular skillset. Our job is to make it easier for other people to do their jobs. Telling them to hire programmers or run off and learn our skills isn't "making it easier".

      This. A thousand times this.

      Somewhere along the way, our industry has developed a collective mentality "we're smarter than you, and we will give you what we want even if we have no idea of what you need".

      Once you get a little further removed and realize that the stuff we're writing/supporting is intended to help the people who do the real, bread and butter parts of the business -- you start to realize if we're an impediment to them, it's worse than if we weren't there at all.

      They're not interested in some smug little bastard looking down his nose at them because they couldn't possibly do what he does. They're interested in getting their stuff done as quickly as possible.

      I can tell you there is nothing more frustrating and counterproductive than some kid straight out of school who thinks the world needs to bow at his feet and stand aside to allow him to tell them how they should do things. Sadly, I've also met developers who have been in the industry a long time who still act like that.

      In many industries, the people who do the real work of the company have highly specialized knowledge, and software is just a tool. And that tool is either helping them get stuff done, or frustrating the hell out of them.

      Acting like we know better than they do (when we in fact know nothing at all about their domain expertise) is at best condescending, and at worst an impediment and a liability.

      --
      Lost at C:>. Found at C.
  2. Clarification by UnknowingFool · · Score: 4, Informative
    From the article:

    Calc is based on object oriented design from 20 years ago when developers thought that a cell should be an object and that creates a huge number of problems around doing things efficiently.

    The problem isn't that Calc is object-oriented, but that it was designed such that many things depended on the spreadsheet cell.

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
    1. Re:Clarification by Trepidity · · Score: 4, Interesting

      Yeah, and it sounds like the GPU angle is really just a hook to get AMD funding. The more important improvements will be refactoring the representation so it doesn't suck in the first place.

    2. Re:Clarification by should_be_linear · · Score: 3, Interesting

      A cell should be an object even today. Their problem is probably that the Cell object contains something like a string object, so creating 1 million cells needs a million pointers and allocations for a million strings, which is a performance killer. What they need to do is: instead of a string, put an int handle for the string into the cell, and keep all strings in a single huge allocated blob (like a StringBlobMap object). Going away from objects to improve performance is rarely a good idea.
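
      A rough sketch of the handle idea, in Python for brevity. `StringPool`, `intern`, and `lookup` are hypothetical names for illustration, not Calc APIs, and the pool here is a plain list rather than one contiguous blob, which is a simplification of what the comment describes.

```python
class StringPool:
    """Hypothetical shared pool: each distinct string is stored once."""

    def __init__(self):
        self._strings = []   # handle -> string
        self._handles = {}   # string -> handle

    def intern(self, text):
        """Return the int handle for text, adding it on first use."""
        handle = self._handles.get(text)
        if handle is None:
            handle = len(self._strings)
            self._strings.append(text)
            self._handles[text] = handle
        return handle

    def lookup(self, handle):
        """Return the string stored under an int handle."""
        return self._strings[handle]


pool = StringPool()
# Cells hold only small integer handles, not their own string objects.
cells = [pool.intern(s) for s in ["paid", "open", "paid", "paid", "open"]]
```

      Five cells end up sharing two stored strings, so repeated text costs one integer per cell instead of one allocation per cell.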

      --
      839*929
  3. Refactor? APU? by JBMcB · · Score: 3, Interesting

    If the refactor is done properly I don't think the OpenCL acceleration would be necessary. Heck, 1-2-3 running on a 486 was pretty speedy.

    --
    My Other Computer Is A Data General Nova III.
  4. Re:How is it? by MightyYar · · Score: 4, Interesting

    I don't think most people say Calc is just as good as Excel - they say that it is good enough for most people. And that is probably true. I think my boss uses Excel for simple formulas and for lists. I use Excel for anything not quite worthy of a Matlab script, so OpenOffice doesn't quite measure up for me but should work fine for my boss.

    --
    W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
  5. Right (Re:If you need it you are doing it wrong.) by WillAdams · · Score: 3, Informative

    Actually, the UI for Lotus Improv was quite nice and won some awards.

    Its (spiritual) successor, Quantrix Financial Modeler seems to be selling well enough, even w/ a $1,495 price point.

    I wish that Flexisheet (an opensource take on this sort of thing) would get more traction.

    --
    Sphinx of black quartz, judge my vow.
  6. Live a day in my shoes by sjbe · · Score: 5, Informative

    If your spreadsheet needs a GPU to speed up calculations, you are probably misusing spreadsheets.

    Or it just means that you have some pretty complicated calculations. More computing horsepower never hurts.

    I know most accountants love the spreadsheet and they make insanely complicated things using spreadsheets pushing it far beyond what these are designed to do.

    I happen to be an accountant as well as an engineer. What, pray tell, do you think spreadsheets were designed to do? (hint - it involves rapid data modeling) They aren't much use if the only problems you solve are toy problems. Plus they require relatively little training to use effectively. Someone can be trained to solve real world problems with them MUCH more easily than with most other tools. Most of the problems I'm asked to solve are ad-hoc investigations into specific questions. I shouldn't need a four-year degree in Comp-Sci to accomplish a bit of data modeling.

    But if you have a spreadsheet that needs this much CPU time to recompute, you should probably be using a full-fledged database with multiple precomputed indexes.

    I use some rather complicated spreadsheets. A database would be of no advantage whatsoever for 99.9% of what I use a spreadsheet for. Furthermore a database would be a lot slower to develop, harder to update, and require significant user interface development. If I'm crunching sales data or generating financial projections a spreadsheet is almost always the easiest and most useful tool for the job.

    Databases come into the picture when: A) other applications need to interface with the data, B) the dataset becomes truly enormous, or C) the number of dimensions in the data exceeds 2 to 3. Sometimes I use databases. Most of the time they would be a waste of money, brains and time. Frequently when I actually need a database I'll create a mock up of the tables and calculations on a spreadsheet first which lets me work out the structure much more easily.

    While it is certainly possible to use a spreadsheet inappropriately, a spreadsheet should be able to handle a rather large amount of data and calculations before it chokes.

  7. Re:the problem with OpenOffice by Bert64 · · Score: 5, Informative

    It's well documented, you can find examples all over Google, e.g.:

    http://hints.macworld.com/article.php?story=20111230095628470

    In fact there are many people who use libreoffice to open and convert corrupted (or very old) files which make msoffice crash. libreoffice is far more tolerant of unexpected data in the input files, as unexpected data is a given when attempting to reverse engineer undocumented formats.

    And to give one personal example, msoffice 97 onwards had a bug in the macro function whereby the line counting function ignored lines with bullet points, so we had an extremely kludgy macro which counted the lines and then iterated through looking for bullet points and increased the line count accordingly... MS decided to fix this particular bug in a "security update" for office 2003, but then reintroduce the bug in 2007... Obviously this kludgy macro catastrophically broke the day that patch got rolled out.
    I could understand if it broke going from 2003 to 2007, but not what is supposed to be a security update changing something like that.

    Also even moving files between the exact same patch release of msoffice on different machines can cause problems with formatting, as it reformats depending on available fonts and printer settings.

    --
    http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  8. Re:Libre Office Calc isn't that good. by gstoddart · · Score: 4, Insightful

    An unsuitable tool might do as a temporary substitute, but long term you really want to use an appropriate tool for the job.

    Look at it this way ... the 40-ton truck in your metaphor (Excel or something like it) is provided to everyone in the company from day 1. From the receptionist to the CEO, everyone gets a 40-ton truck. You know that everyone can carry the same stuff in their 40-ton trucks because they are all pretty much the same.

    Furthermore, before you even leave high school, people teach you how to use that 40-ton truck.

    Now, imagine that you need to solve a new problem, which is shockingly similar to problems you've already solved.

    So you could go through 6 months to a year of fighting to get someone to help you build a station wagon with a baby seat and tinted windows, because the 40-ton truck is overkill. And you need to convince someone to help pay for the station wagon, since they didn't budget for one of those.

    After you've gone through all of that process, the station wagon has never materialized, the cost overruns make it look like you're buying a gold-plated Rolls Royce, but the engine is still a cardboard mock-up, and the people building it for you have forgotten to include headlamps, windshield wipers, turn signals, seatbelts, and a speedometer. But if you will submit a change order to have them build those, you can wait another period of time (and pay even more money).

    Or, you take the 40-ton truck to do what you need, take a little extra time to find a parking spot, and in the end you've got something which covered your needs in a shorter period of time and for no extra costs except your time. You can get to the grocery store and back in a few hours, and you're done.

    That is why people use spreadsheets and don't always jump straight for the custom application.

    --
    Lost at C:>. Found at C.
  9. Appropriate tool use by sjbe · · Score: 4, Interesting

    That's not the issue. If your spreadsheet is SO large that on a MODERN CPU it's slow ... you're doing it wrong.

    It is a relatively trivial matter to make calculations on a dataset slow regardless of the tool used. I work with datasets and related calculations all the time that would make for slow calculations if you hand coded them in assembler. The mere fact that it is slow in a spreadsheet as well has nothing inherently to do with it being worked on in a spreadsheet. Now if the spreadsheet can't handle 65K rows by 65K columns then it shouldn't offer that size table as an option. But most can handle datasets that size and larger without too much trouble. For rapid data modeling and ad-hoc analysis a spreadsheet can be pretty hard to beat.

    When people go wrong using spreadsheets it's usually in one of a few ways. The one I see the most is when they take what should be a prototype analysis and turn it into a production tool. If you need to put a bunch of buttons and other interface tools on a spreadsheet THEN you are doing it wrong. The second is when they try to analyze data involving more than 3 dimensions. While it can be done, it is rarely a good idea. Another I see is when they try to have more than one person working on the spreadsheet. If the dataset is truly huge or you require multi-user access or you need to interface with other applications, then by all means use something other than a spreadsheet.