Slashdot Mirror


Load List Values for Improved Efficiency

An anonymous reader writes "Reduce the number of database hits and improve your Web application's efficiency when you load common shared list values only once. In this code-filled article, learn to load the values for drop-down lists when your Web application starts and then to share these loaded list values among all the users of your application."

38 of 207 comments

  1. You know what's great by Anonymous Coward · · Score: 3, Insightful

    When telling us there's code... TELLING US WHAT LANGUAGE. It's Java.

    1. Re:You know what's great by Anonymous Coward · · Score: 2, Funny

      Thank you so much for telling me this. I would have been completely stuck if you hadn't, I had missed the java headers right at the top of the first bit of code and thought it was ZX80 assembler.

    2. Re:You know what's great by ahmusch · · Score: 4, Insightful

      A database is not the be-all or end-all of computing; however, it is the be-all and end-all of data storage and access when things such as consistency, concurrency, and recoverability are at issue.

      If you take your example of codes in a file, how will you distribute it? How will you maintain it? How can multiple users access it? What happens if someone accidentally deletes it?

      Databases are designed to solve exactly these types of problems.

      You need to have the One True Copy of the data, in all cases. If you wish to distribute a flat-file or marked up copy of that data provided it's completely static aside from a software revision, then that's fine. But keeping key data unprotected -- or worse, opening up yourself to multiple masters -- demonstrates to me that you're thinking about things at the hobby/toy project level, certainly not at a distributed or COTS level.

      Your data is your application. It doesn't matter if the data is static or not. And for goodness sakes, if you're going to be using a database anyway for the dynamic data, eat the storage and keep the master copy in the freaking database!

      (And if you're not using a database at all, well, that's further evidence to me that you're thinking at the toy/hobby level of project scope.)

      If you were going to store large, binary data sets (such as video) in a database, I'd tell you to use Oracle and either store the data out-of-line with BFILEs, using the database as an indexing mechanism, or store it in the database as BLOBs, depending on whether access time (use BFILEs!) or recoverability (use BLOBs!) was more important. I don't know what that project used, but writing either of the apps with those data storage concepts is pretty trivial.

  2. Changes to the lists? by saundo · · Score: 5, Insightful

    Interesting article, but preloading those values will invariably lead to out of sync conditions when the backend changes. Nothing mentioned in the text as to how to cater for that eventuality.

    --
    -- The problem with troubleshooting is that sometimes trouble shoots back.
    1. Re:Changes to the lists? by AndrewStephens · · Score: 3, Informative
      An excellent point, the article assumes that the data will not change very often, if at all. However, if the list data doesn't change very often then there is little point storing it in the database in the first place.

      Not to say that the article was actually bad or anything, it's just a little light on when you would want to use this, and what some of the problems with this approach are.

      --
      sheep.horse - does not contain information on sheep or horses.
    2. Re:Changes to the lists? by Flibz · · Score: 2, Interesting

      I find that something that helps there, particularly if different users have different data for the list, is to use javascript.

      Basically, create a session variable for the user (we'll call it cacheTimeStamp) with the current date/time to the hour (i.e. yymmddhh).

      When the page with the drop-down list is called, check current timestamp. If it's expired, create a new javascript .js file & delete the old one.

      Then, when the <script> tag is called, use the cacheTimeStamp value to ensure a current .js file is called and let the client browser handle the caching...

    3. Re:Changes to the lists? by Anonymous Luddite · · Score: 3, Insightful

      >> posts here in Slashdot?

      Apples and oranges.

      You're talking about adding items (posts) to a recordset (slashdot thread). The items are static but the recordset is not. It changes frequently.

      The caching they're talking about involves a recordset that seldom changes, and would therefore be suited to being stored outside a database and rebuilt as it changes - i.e. one trip to the database per change rather than one trip per view. This wouldn't make sense with something like a slashdot thread where records are added non-stop...

    4. Re:Changes to the lists? by ahmusch · · Score: 2, Informative

      Of course, the article leaves one key point unsaid but implied...

      Understand the nature of your data.

      In any system, there are basically three kinds of data:

      -- Static: This is the stuff that changes at a glacial pace, such as state codes, currency codes, and so forth. (for bonus points, put all the static code/description values in a single table with a type identifier for an even larger performance increase at the cost of slightly more complicated code.)

      -- Configuration: This is the data that drives the logic of the application. (Go ahead, put it all in code. Good luck with the maintenance.)

      -- Data: The actual records that you process or interrogate at some level to do something.

      If the JDBC monkeys don't understand the nature of the data at this fundamental level, then telling them to cache data is a recipe for concurrency and consistency nightmares. Not because they mean to, but because they don't know any better.

      Caching the static data is probably safe, but the configuration data is only *somewhat* safe to cache. It's too expensive to continually round-trip for everything, so what my project has done is implement a warm-boot process. For every transactional record it attempts to process, it checks a warm-boot status table. If the time has changed, the app flushes and repopulates its caches of both static and configuration data. Sure, it's a little kludgy, but it gives us at least parts of the best of both worlds.
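
      A minimal Java sketch of that warm-boot check, under stated assumptions: the class and method names are hypothetical, `statusReader` stands in for a cheap query against the warm-boot status table, and `loader` stands in for the real static/configuration queries. It is one way to do it, not the code from the project described.

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Reloads cached values whenever a "last changed" marker moves (hypothetical names). */
class WarmBootCache<T> {
    private final LongSupplier statusReader; // assumption: reads the warm-boot status table
    private final Supplier<List<T>> loader;  // assumption: runs the real list query
    private long seenStamp;
    private List<T> values;

    WarmBootCache(LongSupplier statusReader, Supplier<List<T>> loader) {
        this.statusReader = statusReader;
        this.loader = loader;
    }

    /** Call once per transactional record: one cheap status read, reload only on change. */
    synchronized List<T> get() {
        long stamp = statusReader.getAsLong();
        if (values == null || stamp != seenStamp) {
            values = List.copyOf(loader.get()); // flush and repopulate
            seenStamp = stamp;
        }
        return values;
    }
}

class WarmBootDemo {
    public static void main(String[] args) {
        long[] stamp = {1L};
        int[] loads = {0};
        WarmBootCache<String> cache = new WarmBootCache<>(
                () -> stamp[0],
                () -> { loads[0]++; return List.of("USD", "EUR"); });
        cache.get();
        cache.get();      // status unchanged: served from the cache
        stamp[0] = 2L;    // someone bumped the status row
        cache.get();      // status changed: cache is rebuilt
        System.out.println("loads = " + loads[0]);
    }
}
```

      The trade-off is the one already admitted: every record still pays for one status read, but never for the full set of lookups.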

    5. Re:Changes to the lists? by orasio · · Score: 4, Insightful

      Well, I develop apps in Java, and of course, apply a caching technique for lists.

      My data does change, but I store it in a tree.
      When someone changes a dropdown, I just erase the cache. When someone wants the list, it gets refilled. All of it is safely synchronized.
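
      Roughly, in Java, and only as a sketch: the names are made up for illustration and `loader` is a stand-in for whatever query fills the dropdown. Erasing sets the cached list to null, and the next reader refills it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class InvalidatingListCache {
    private final Supplier<List<String>> loader; // assumption: stands in for the database query
    private List<String> cached;                 // null means "needs a refill"

    InvalidatingListCache(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    /** Refills lazily on the first read after an invalidation. */
    synchronized List<String> get() {
        if (cached == null) {
            cached = List.copyOf(loader.get());
        }
        return cached;
    }

    /** Called by whatever code changes the dropdown's backing data. */
    synchronized void invalidate() {
        cached = null;
    }
}

class InvalidatingListCacheDemo {
    public static void main(String[] args) {
        List<String> table = new ArrayList<>(List.of("Open", "Closed"));
        InvalidatingListCache cache = new InvalidatingListCache(() -> table);
        System.out.println(cache.get());
        table.add("Pending");   // an edit to the dropdown's data...
        cache.invalidate();     // ...is followed by erasing the cache
        System.out.println(cache.get());
    }
}
```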

      Of course, I don't believe it's worth an article, and I don't believe it belongs in /. frontpage.
      IBM developerworks is a nice source of information when you want to program the mainstream way. It's good for teams, because it makes easily understandable code. Of course it's boooring and not news, though.

    6. Re:Changes to the lists? by willCode4Beer.com · · Score: 2, Informative

      I'm going to have to argue that storing data in static member variables is generally a bad idea. (Of course, so is using scriptlets in the JSP) Excessive use of static variables can lead to unpredictable behavior, and can be hard to debug. Also, syncing the data becomes more complex because of multi-threaded access to the variable.

      While working on high performance web apps where we want to cache the data and prevent having it become stale, we generally store it in the application context ( ServletContext.setAttribute(name,object) ) with date information.
      So, you create a class to hold a date object representing the moment the data was cached, and the data you wish to cache. Have your servlet or action class (you do have some kind of MVC right?) check that the date is not beyond your pre-determined max life; if it is, re-fetch the data, otherwise, continue on. Some apps would continue on if another process was busy updating. Depends on requirements and load.

      This would also mean not fetching the data at application startup. Which, depending on your app, can be a good thing. If your app is deployed in a cluster, and different boxes may get different types of requests you may not actually need all of the data cached on every box.

      We also have a servlet (security controlled) that can be called to flush the cache. So, when an editor is using a content management tool and hits publish, the last step after updating the DB is to flush the cache.
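
      A sketch of that holder-plus-max-life idea in Java. To keep it runnable without a servlet container, a plain map stands in for the ServletContext attributes; `TimedEntry`, `ContextCache` and the `fetch` supplier are hypothetical names, not anything from the article or a real codebase.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** The cached data plus the moment it was cached. */
final class TimedEntry {
    final long cachedAtMillis;
    final List<String> data;

    TimedEntry(long cachedAtMillis, List<String> data) {
        this.cachedAtMillis = cachedAtMillis;
        this.data = data;
    }
}

class ContextCache {
    // Stand-in for ServletContext.setAttribute/getAttribute (assumption, so no container is needed).
    private final Map<String, Object> context = new ConcurrentHashMap<>();
    private final long maxLifeMillis;

    ContextCache(long maxLifeMillis) {
        this.maxLifeMillis = maxLifeMillis;
    }

    /** Re-fetches when the entry is missing or older than the max life; otherwise continues on. */
    List<String> get(String name, Supplier<List<String>> fetch, long nowMillis) {
        TimedEntry entry = (TimedEntry) context.get(name);
        if (entry == null || nowMillis - entry.cachedAtMillis > maxLifeMillis) {
            // Two threads may both re-fetch here; this sketch accepts that rather than locking.
            entry = new TimedEntry(nowMillis, List.copyOf(fetch.get()));
            context.put(name, entry);
        }
        return entry.data;
    }

    /** What the security-controlled flush servlet would call after a publish. */
    void flush() {
        context.clear();
    }
}

class ContextCacheDemo {
    public static void main(String[] args) {
        ContextCache cache = new ContextCache(60_000); // hypothetical one-minute max life
        Supplier<List<String>> fetch = () -> List.of("Open", "Closed");
        System.out.println(cache.get("statusList", fetch, System.currentTimeMillis()));
    }
}
```

      Nothing is fetched at startup here; each list is loaded the first time a request on that box asks for it.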

      Of course, when I see a *java* programmer using the old *Hashtable* and Vector classes, I'm instantly prejudiced against his code.

      --
      ----- If communism is a system where the government owns business, what do you call a system where business owns govern
  3. Huh? by Renegade Lisp · · Score: 5, Insightful
    Sorry I don't get it. Of course, when you load your data from a cache in main memory, even from within the same address space, you are several orders of magnitude faster than if you make the trip to the database each time. And by several orders of magnitude I mean six to seven orders: you'll easily be a million times faster for a given operation. (A database roundtrip is on the order of tens of milliseconds, while a lookup in a Java hashtable takes mere nanoseconds on typical hardware.)

    What's the point? Since when is Slashdot a forum for random tech tips (and not very thrilling ones at that)? Did IBM pay to get this posted? Is Slashdot trying to make fun of IBM by actually posting it?

    1. Re:Huh? by 0x461FAB0BD7D2 · · Score: 3, Funny

      It's because of all the Adblock users out there that we get ads as stories now. The summary even reads like one. Or it's just a slow news day.

      It all depends on how tight your tin foil hat is.

    2. Re:Huh? by computational super · · Score: 5, Insightful

      I think you're being FAR too polite here, sir. Feel free to drop in an occasional, "Are you f-ing kidding me with this drivel?" in your critique of this type of ridiculously simplistic and obvious article.

      On the other hand, there's a good take-away here. If this "revolutionary technique" was so mind-bending to IBM consulting services, I know where I won't be spending my consulting dollars...

      --
      Proud neuron in the Slashdot hivemind since 2002.
  4. Oblig. Simpsons Quote by bigtallmofo · · Score: 4, Funny

    "And people, I can't stress this enough. Put your garbage in the garbage can."

    "Garbage in the garbage can. Hmm. Makes sense."

    --
    I'm a big tall mofo.
  5. Slow news day? Christ. by Electroly · · Score: 5, Insightful

    This just in! Caching frequently-used data yields performance improvements! Film at 11!

  6. April Fools... by Kr3m3Puff · · Score: 4, Funny

    Is this April First?

    Wow, next an article about using "for" loops? The benefits of "bubble sort"? "Binary trees"?

    --
    D.O.U.O.S.V.A.V.V.M.
    1. Re:April Fools... by Vombatus · · Score: 2, Funny
      Careful there...

      Someone might hold the patents for those techniques.

      --
      This sig is intentionally blank
  7. Well duh! by AndrewStephens · · Score: 3, Insightful

    Keeping frequently used data in static singletons, who would have thought it!
    Seriously, this is probably good advice for someone just starting out programming, but I would expect anyone with any experience at all to know about this. It's hardly a revolutionary new technique.

    --
    sheep.horse - does not contain information on sheep or horses.
    1. Re:Well duh! by DaHat · · Score: 2, Insightful

      Be careful with mentioning design patterns like Singletons, you may lose most of the spaghetti code programmers.

  8. Done this for years by Ckwop · · Score: 5, Insightful

    Well thank you captain obvious.. I've been doing this for years with ASP. Just load the contents of the listboxes into the Application object.

    In ASP.NET you can even do cache invalidation when the database changes. Simply create an extended stored procedure that's fired when any of your update/insert procedures run and that writes the changed record ids to a queue (using Microsoft's Messaging and Queuing service), then have a thread in the ASP.NET process that periodically checks the queue for new messages and clears the values that have changed out of the cache.

    Because the Queuing service works across networks it's a really neat way to provide scalability in web applications - if you can't wait for SQL 2005 which will provide cache invalidation on database updates as standard.
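
    The same shape can be sketched in Java, with the caveat that this is not the ASP.NET/MSMQ setup itself: an in-process BlockingQueue stands in for the message queue, `recordChanged` plays the part of the stored procedure, and all names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class QueueInvalidatedCache {
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    // Assumption: in-process stand-in for the queue the database side writes changed ids to.
    private final BlockingQueue<Integer> changedIds = new LinkedBlockingQueue<>();

    void put(int id, String value) {
        cache.put(id, value);
    }

    String get(int id) {
        return cache.get(id);
    }

    /** Called by whatever plays the role of the stored procedure. */
    void recordChanged(int id) {
        changedIds.add(id);
    }

    /** One polling pass: drain pending messages, evict those ids, return how many were evicted. */
    int drainOnce() {
        int evicted = 0;
        Integer id;
        while ((id = changedIds.poll()) != null) {
            if (cache.remove(id) != null) {
                evicted++;
            }
        }
        return evicted;
    }
}

class QueueInvalidatedCacheDemo {
    public static void main(String[] args) {
        QueueInvalidatedCache cache = new QueueInvalidatedCache();
        cache.put(1, "Open");
        cache.put(2, "Closed");
        cache.recordChanged(2);            // record 2 was updated in the database
        int evicted = cache.drainOnce();   // a background thread would call this periodically
        System.out.println("evicted " + evicted + ", id 2 cached: " + (cache.get(2) != null));
    }
}
```

    Only the changed ids are dropped; everything else stays cached until its own change message arrives.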

    Simon.

  9. PHP-ADODB Caches these queries by displague · · Score: 3, Informative

    With ADODB for PHP (and perl, http://adodb.sf.net/), you can call CacheGetAll, CacheExecute, etc... The query resultset is saved to a temporary file. This avoids having to create the cache within the same function you would normally call without having to write extra code.

    [first useful post?]

    --
    Marques Johansson
  10. Is This Consultant on YOUR Payroll? by rimu guy · · Score: 5, Informative

    If you have yet to read the 25-page FA, may I present the precis:

    Database hits are expensive. Reduce them where possible. For example, cache static lookup data.

    The simplicity of the point however is lost in the complexity of the article. It covers web.xml settings, servlet classes, list value loaders, persistence backends for said loaders, data source 'helpers' for said loaders, custom object classes for the loaders, several subclasses for said object classes, and a jsp page (to boot). Phew.

    The author refers to this design as a quick and easy approach. It is not. If you are not familiar with Java and read this article, please do not be put off. He could have demonstrated the point with a far simpler example. E.g. static variable, sql statement, jsp code, done.

    [The author] has worked with IBM Global Services for one year, and has five years of experience in J2EE-related technologies. And it shows. I dread to think how much a fully realized IBM Global Services project would cost should all its consultants apply this sized sledgehammer to each small task. Hopefully the article was not written up on the client's dime as well.

    --
    Java Hosting

    1. Re:Is This Consultant on YOUR Payroll? by CastrTroy · · Score: 2, Funny

      Is that "Lead" the element Pb, or "Lead" as in, in charge. Seems like his code is as heavy as the former.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
  11. GoF Decorator pattern is better for this task by MarkEst1973 · · Score: 4, Insightful
    Assuming one writes a well-factored application, I believe Decorator makes a much better caching mechanism than what the article presents.

    A Data Access Object should have 1) an interface (defining add, remove, retrieve, etc.) and 2) a standard implementation of the interface that reads/writes to the database on every method invocation.

    A Decorator can implement the data access interface, delegating all method invocations to a wrapped instance of the standard implementation. Decorate the behavior of the standard impl. by providing a cache, checking the cache before retrieving a model and updating the cache before saving a model.

    Because the standard impl. and the decorator share the same interface, you can have a factory create instances for you. Your code doesn't know or care which instance it is using. Mix and match Decorators to your heart's delight. A logging Decorator (track what data is being accessed, etc.) can be thrown into the mix, and again your calling code wouldn't be the wiser.

    This pattern is easily unit tested and load tested. It doesn't require a running web container to test or run. It Just Works(tm).
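
    A minimal Java sketch of the arrangement, with assumptions stated up front: `StatusDao`, `DbStatusDao` and `CachingStatusDao` are hypothetical names, the interface is cut down to two methods, and a HashMap stands in for the database so the example runs without one.

```java
import java.util.HashMap;
import java.util.Map;

interface StatusDao {
    String retrieve(int id);
    void save(int id, String label);
}

/** Standard implementation; a HashMap stands in for the database (assumption). */
class DbStatusDao implements StatusDao {
    private final Map<Integer, String> table = new HashMap<>();
    int reads = 0; // counts simulated database hits

    public String retrieve(int id) {
        reads++;
        return table.get(id);
    }

    public void save(int id, String label) {
        table.put(id, label);
    }
}

/** Decorator: same interface, checks a cache before delegating to the wrapped DAO. */
class CachingStatusDao implements StatusDao {
    private final StatusDao delegate;
    private final Map<Integer, String> cache = new HashMap<>();

    CachingStatusDao(StatusDao delegate) {
        this.delegate = delegate;
    }

    public synchronized String retrieve(int id) {
        String hit = cache.get(id);
        if (hit == null) {
            hit = delegate.retrieve(id);
            if (hit != null) {
                cache.put(id, hit);
            }
        }
        return hit;
    }

    public synchronized void save(int id, String label) {
        cache.put(id, label);      // update the cache before saving
        delegate.save(id, label);
    }
}

class DecoratorDemo {
    public static void main(String[] args) {
        DbStatusDao db = new DbStatusDao();
        db.save(1, "Open");
        StatusDao dao = new CachingStatusDao(db); // callers only ever see StatusDao
        dao.retrieve(1);
        dao.retrieve(1);
        System.out.println("database reads: " + db.reads);
    }
}
```

    A logging decorator would have the same shape: implement StatusDao, wrap another StatusDao, and log before delegating.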

  12. Well, duh by Tim C · · Score: 4, Insightful

    It's called caching, and it's been done since people had to load in commonly-used external references.

    I've not RTFA, so perhaps it's truly excellent, but why the hell has this been posted? Anyone who's writing any sort of application and not making intelligent use of caching is either really junior, or should probably be looking for a new job.

  13. Memcached by captainclever · · Score: 2, Informative

    Or you could use something like Memcached. Works with pretty much any language, and tonnes less hassle. (Thanks LJ ;))

    --
    Last.fm - join the social music revolution
  14. News? by Baavgai · · Score: 3, Funny

    Seriously, if you're into servlets and you haven't been caching non dynamic data, shame on you.

    By the same token, if you're one of those twits who caches five years of data warehouse information in the application layer, there's a special place in programmer hell for you...

  15. Not that bad a hint. by Ancient_Hacker · · Score: 2, Interesting
    Yes, it's obvious to anybody with half a brain. But from the looks of several apps out there, there's a LOT of coding being done by the lower half of the bell curve:
    • One very old 200,000 line app, written in assembler, that could only ever run on one particular CPU, had a #define for the number of bits in a packed character field (6 bits BTW).
    • One really losing Java app does about a bazillion (200+) separate SQL queries to ask for things that have not changed in 50 years. Funny, the app runs slowly, even on a rather hefty server cluster. It runs much slower than the old CICS mainframe app it replaced, which ran in one 30 MHz CPU, 4MB of RAM.
    • Many apps do SQL queries to get the names of the days of the week. And the names of the months. And the abbreviations for same.
    • It's a
    1. Re:Not that bad a hint. by ergo98 · · Score: 4, Funny
      • It's a


      Looks like you had an SQL Statement timeout there.

  16. And for the follow-up article... by azuroff · · Score: 5, Funny

    ...this "Lead Developer" at IBM will discover the wonderful performance benefits of pooling database connections, rather than open a new connection every single fricking time he hits the database. No wonder he saw a massive performance increase when he learned how to cache the lookup values.

    And what's up with the "obj_" prefixes in some of the code listings? Newsflash: Java is an object-oriented programming language, and most competent Java programmers can figure out that a variable called resultSet probably isn't a primitive. You don't need to waste another 4 characters pointing out that fact.

  17. What's your Vector, Victor? by Vengeance · · Score: 2, Insightful

    After bracing myself with some more coffee, I read a bit more of this article.

    Bad, bad, bad.

    What's with the Vectors, anyway? I haven't used those in years.

    --
    It was a joke! When you give me that look it was a joke.
  18. Re:Java Servlet init question .. by Vengeance · · Score: 2, Informative

    The individual list types are all implemented as singletons.

    That is, they have static 'getInstance' methods, maintain private static references to their own solitary instance, and have private constructors.

    This is a very standard OO pattern, and seems to be one thing the article got essentially right.
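
    For anyone who hasn't seen the shape, here is a bare-bones Java sketch. `StatusList` and its placeholder values are invented for illustration and are not the article's class; eager initialization is used here because class loading makes it thread-safe without extra locking.

```java
import java.util.List;

final class StatusList {
    // Private static reference to the one and only instance, created when the class loads.
    private static final StatusList INSTANCE = new StatusList();

    private final List<String> values;

    // Private constructor: nobody else can create a second instance.
    private StatusList() {
        // Placeholder values (assumption); a real list type would load these from the database once.
        values = List.of("Open", "Closed");
    }

    static StatusList getInstance() {
        return INSTANCE;
    }

    List<String> values() {
        return values;
    }
}

class StatusListDemo {
    public static void main(String[] args) {
        System.out.println(StatusList.getInstance().values());
        System.out.println(StatusList.getInstance() == StatusList.getInstance());
    }
}
```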

    --
    It was a joke! When you give me that look it was a joke.
  19. Externalize Picklists... by zettabyte · · Score: 3, Insightful
    He should be externalizing the picklists to a .properties file. He'll get the caching through Properties and the strings will be ready for localization.

    If the picklists are at all updateable while the application is running, he can cache as he does, but he'll need a mechanism to invalidate the cache and re-read from the database.
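
    A small Java sketch of the .properties route, assuming a made-up layout of one key per picklist with comma-separated labels. The file contents are passed in as text so the example is self-contained; `Picklists` and the `status` key are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class Picklists {
    /** Reads "key=comma,separated,labels" entries and returns the labels for one key. */
    static List<String> load(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> labels = new ArrayList<>();
        for (String label : props.getProperty(key, "").split(",")) {
            String trimmed = label.trim();
            if (!trimmed.isEmpty()) {
                labels.add(trimmed);
            }
        }
        return labels;
    }
}

class PicklistsDemo {
    public static void main(String[] args) {
        // Stands in for the contents of a hypothetical picklists.properties file.
        String text = "status=Open, Closed, Pending\n";
        System.out.println(Picklists.load(text, "status"));
    }
}
```

    For localization, one file per locale with the same keys is the usual arrangement; that part is left out of the sketch.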

    Forgetting that for a moment:
    1. Hungarian notation is NOT necessary in Java. Period. End of story.
    2. No one uses Vector anymore.
    3. There are some nice tag libraries, so STOP PUTTING JAVA CODE IN JSPS!
    If you're a code monkey, memorize these two quotes:
    Increasingly, people seem to misinterpret complexity as sophistication, which is baffling - the incomprehensible should cause suspicion rather than admiration. - Niklaus Wirth
    and
    ...it is simplicity that is difficult to make. - Bertolt Brecht
    (quotes from http://www.vanderburg.org/Misc/Quotes/soft-quotes.html)
    1. Re:Externalize Picklists... by philipborlin · · Score: 3, Informative
      There are very good reasons not to use Vector. The main one is that it internally synchronizes all calls. Any algorithms 101 class will show you why that is a false sense of security.

      The classic example is:
      vector.size();
      vector.get(0);

      Each one of those calls is synchronized internally but the JVM can still switch threads in between the two calls, causing a race condition. To make that code thread safe you need to synchronize externally:

      synchronized(vector) {
          vector.size();
          vector.get(0);
      }

      Now the code is thread safe but there were three synchronization calls (our explicit call and one for size() and one for get()). Very inefficient.
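
      One way around the double locking, sketched in Java under the assumption that you control the wrapper class: keep an unsynchronized ArrayList and guard each compound operation with a single lock. `GuardedList` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedList {
    // Unsynchronized list; every access goes through this object's own lock.
    private final List<String> items = new ArrayList<>();

    synchronized void add(String s) {
        items.add(s);
    }

    /** Check-then-act under one lock: the emptiness check and get(0) cannot be split by another thread. */
    synchronized String firstOrNull() {
        return items.isEmpty() ? null : items.get(0);
    }

    synchronized void clear() {
        items.clear();
    }
}

class GuardedListDemo {
    public static void main(String[] args) {
        GuardedList list = new GuardedList();
        System.out.println(list.firstOrNull());
        list.add("Open");
        System.out.println(list.firstOrNull());
    }
}
```

      Each call takes exactly one lock, and callers never see the list between the check and the read.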

    2. Re:Externalize Picklists... by zettabyte · · Score: 3, Insightful
      There are other responses above mine which better address your questions/disagreements. My responses below are meant to be humorous and/or obnoxious, depending on your perspective.
      Why the hell not? Is it uncool now? Do I need to stop using it for l33t status?
      Yes. It's uncool. If you wish to be 1337, you need to stop using Vector. Only \/\/4|\|k3rz use Vector. At least that's what they told me in my Java class at DeVry(1). ** braces for DeVry graduate flames **
      One might argue it is easier to read especially for something simple.
      Those of us with inferior programming skills prefer to separate our View code (JSPs) from our Controller code (Servlets). We do so because we are not capable of remembering where some page setup code was originally written (Where is that list box set up? In the servlet? the display.jsp? the header.jsp include? the vars.jsp include? one of the includes in one of the includes? The base servlet? The action servlet? the form? the application initialization plugin? etc.). Following this practice, we always know where to go to find the setup for a view. However, someone with your superior coding skills, who can clearly remember where all page setup code exists within an application (whether written by you or not), need not adhere to this rule.

      No, it is I who hopes to never work with you. I would clearly be out of my league and unable to keep up, what with having to use design patterns like MVC to help me remember where to find code that I have (and even have not) written.

      Are you one of those types that spend all thier time critizing other people's code just because it is not how they would code it? I hope I never work with you.
      Yes. I am. I'll give you an example.
      "thier" != "their"
      and
      "critizing" != "criticizing"
      (1) I have nothing against DeVry graduates or their skills. I don't know any DeVry alumni, nor have I ever worked with anyone having any affiliation with DeVry. I'm merely making a joke.
  20. What, The Invention of Cache? by cowgoesmoo2004 · · Score: 2, Funny

    Thanks for the hot tip... I'm really anxious to see how the world changes now that the concept of the cache has finally been brought to light. Behold, sarcasm, I've invented sarcasm! Wow, maybe we should create a /. article for me?

  21. Data integrity by coyote-san · · Score: 4, Insightful

    The reason you would maintain "static" data in the database is for data integrity.

    For instance, the list of US states and Canadian provinces does not change very frequently. I think the last Canadian change was about a decade ago, the last US change nearly 50 years ago.

    The "full name, abbreviation" name used in most pulldown lists (full name as label, abbreviation as value) can obviously be considered static.

    So why keep it in a relational database? Simple - you can use it to provide referential integrity for all "state" fields in the rest of the database.

    This isn't a huge deal with states, but it can be very important with domain level enumerations. Your form actions may be well-behaved, but a robust system must account for clowns who feed their own data directly into your action URL.

    (As an aside, this isn't a theoretical problem. I've heard stories of people getting an order form for, oh, a laser printer. They capture it, change the price of the printer from $499.99 to $49.99 and submit the order. The action accepted it, and when the company attempted to refute it they lost because it was considered a bona fide negotiation since the web site could/should have been programmed to reject forms with altered prices. They made an offer, the client made a counteroffer, and the company accepted it. This depends on your state, etc. Given the current political climate I wouldn't be surprised to learn that this is now considered computer fraud with a 10 year prison sentence.)
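
    The server-side half of that can be sketched in a few lines of Java. This is only an illustration: `StateValidator` is a hypothetical name and the hard-coded set is a tiny subset standing in for values that would really come from the same table the foreign key references.

```java
import java.util.Set;

class StateValidator {
    // Illustrative subset (assumption); in practice load this from the reference table.
    private static final Set<String> ALLOWED = Set.of("CA", "NY", "ON", "BC");

    /** Accepts only values the server itself knows about, whatever the form claims to have offered. */
    static boolean isValid(String submitted) {
        return submitted != null && ALLOWED.contains(submitted);
    }
}

class StateValidatorDemo {
    public static void main(String[] args) {
        System.out.println(StateValidator.isValid("ON"));
        System.out.println(StateValidator.isValid("ZZ")); // hand-crafted request with a bogus code
    }
}
```

    The database constraint remains the last line of defence; a check like this just rejects bad input before it gets that far.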

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  22. It's a lot worse than just the Vectors by attonitus · · Score: 2, Informative
    There's the use of Hashtable; the non-thread-safe singletons; the traversal of the entire lists in ValidValuesTable.getOption; the unnecessary storing of the options and values string-arrays; the failure to pre-size the ArrayList; the fact that DropDownItems aren't immutable and, my favourite (repeated several times throughout the code):
    private void loadStatusList() {
    Hashtable obj_Hashtable = new Hashtable();
    ListValuesDAO listValuesDAO = new ListValuesDAO();
    obj_Hashtable = listValuesDAO.getListValues("STATUS");
    ...
    I hope that IBM Global Services' QA for products is better than their QA for developerworks articles.
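
    For contrast, a short Java sketch of three of those fixes: an immutable item, a pre-sized list, and no throwaway allocation before the real value is assigned. `DropDownOption` and `OptionLists` are hypothetical names, not the article's classes, and the input map stands in for whatever the DAO returns.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Immutable item: final fields, no setters. */
final class DropDownOption {
    private final String value;
    private final String label;

    DropDownOption(String value, String label) {
        this.value = value;
        this.label = label;
    }

    String value() {
        return value;
    }

    String label() {
        return label;
    }
}

class OptionLists {
    /** Builds the option list in one pass from value/label rows. */
    static List<DropDownOption> fromMap(Map<String, String> rows) {
        List<DropDownOption> out = new ArrayList<>(rows.size()); // pre-sized to the row count
        for (Map.Entry<String, String> e : rows.entrySet()) {
            out.add(new DropDownOption(e.getKey(), e.getValue()));
        }
        return Collections.unmodifiableList(out);
    }
}

class OptionListsDemo {
    public static void main(String[] args) {
        // The result is assigned directly; nothing is allocated just to be overwritten.
        List<DropDownOption> options = OptionLists.fromMap(Map.of("O", "Open"));
        System.out.println(options.get(0).label());
    }
}
```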