Load List Values for Improved Efficiency
An anonymous reader writes "Reduce the number of database hits and improve your Web application's efficiency when you load common shared list values only once. In this code-filled article, learn to load the values for drop-down lists when your Web application starts and then to share these loaded list values among all the users of your application."
When telling us there's code... TELL US WHAT LANGUAGE. It's Java.
Interesting article, but preloading those values will invariably lead to out of sync conditions when the backend changes. Nothing mentioned in the text as to how to cater for that eventuality.
-- The problem with troubleshooting is that sometimes trouble shoots back.
What's the point? Since when is Slashdot a forum for random tech tips (and not very thrilling ones at that)? Did IBM pay to get this posted? Is Slashdot trying to make fun of IBM by actually posting it?
"And people, I can't stress this enough. Put your garbage in the garbage can."
"Garbage in the garbage can. Hmm. Makes sense."
I'm a big tall mofo.
This just in! Caching frequently-used data yields performance improvements! Film at 11!
Is this April First?
Wow, next an article about using "for" loops? The benefits of "bubble sort"? "Binary trees"?
D.O.U.O.S.V.A.V.V.M.
Keeping frequently used data in static singletons, who would have thought it!
Seriously, this is probably good advice for someone just starting out programming, but I would expect anyone with any experience at all to know about this. It's hardly a revolutionary new technique.
sheep.horse - does not contain information on sheep or horses.
Well thank you captain obvious.. I've been doing this for years with ASP. Just load the contents of the listboxes into the Application object.
In ASP.NET you can even do cache invalidation when the database changes. Simply create an extended stored procedure that fires whenever any of your update/insert procedures run and writes the changed record IDs to a queue (using Microsoft's Messaging and Queuing service), then have a thread in the ASP.NET process that periodically checks the queue for new messages and clears the changed values out of the cache.
Because the Queuing service works across networks it's a really neat way to provide scalability in web applications - if you can't wait for SQL 2005 which will provide cache invalidation on database updates as standard.
Simon.
With ADODB for PHP (and perl, http://adodb.sf.net/), you can call CacheGetAll, CacheExecute, etc. The query result set is saved to a temporary file. This avoids having to create the cache within the same function you would normally call, and you don't have to write extra code.
[first useful post?]
Marques Johansson
Where's the patent?
No need to post anything else in this thread! ROTFL
At least partially.
So, you can save results in a file, but can you save them in memory too, using PHP?
I'm still trying to figure out what people mean by 'social skills' here.
and of course smarty.php.net is there too !
Chris ,
Php Programmers.
In init(), there is
..
ListValuesLoader listValues = new ListValuesLoader();
but no global.
However, I can't find the object where the data is actually cached. Is it a Singleton somewhere? Pretty well hidden it must be
I'm still trying to figure out what people mean by 'social skills' here.
For additional style points, load all lookup tables in one query, and concatenate the HTML into the values.
But I may be the Kurtz of SQL...
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
If you have yet to read the 25-page FA, may I present the precis:
The simplicity of the point however is lost in the complexity of the article. It covers web.xml settings, servlet classes, list value loaders, persistence backends for said loaders, data source 'helpers' for said loaders, custom object classes for the loaders, several subclasses for said object classes, and a jsp page (to boot). Phew.
The author refers to this design as a quick and easy approach. It is not. If you are not familiar with Java and read this article, please do not be put off. He could have demonstrated the point with a far simpler example. E.g. static variable, sql statement, jsp code, done.
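For what it's worth, the "static variable, SQL statement, done" version could look roughly like the sketch below. The class name SharedLists and the hard-coded rows are made up for illustration; in a real application loadStatesFromDatabase() would run the query over JDBC instead of returning a fixed list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for list values shared by every user of the web app.
final class SharedLists {
    // Counts how often the "database" was hit; only here to show the effect.
    static int loadCount = 0;

    private static List<String> states;

    private SharedLists() {}

    // Loads the list on first use and hands out the same read-only copy afterwards.
    static synchronized List<String> states() {
        if (states == null) {
            states = Collections.unmodifiableList(loadStatesFromDatabase());
        }
        return states;
    }

    // Placeholder for something like "SELECT name FROM states" (assumption, not the article's code).
    private static List<String> loadStatesFromDatabase() {
        loadCount++;
        return Arrays.asList("Alabama", "Alaska", "Arizona");
    }
}
```

A JSP or servlet would then just call SharedLists.states() when rendering the drop-down.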
[The author] has worked with IBM Global Services for one year, and has five years of experience in J2EE-related technologies. And it shows. I dread to think how much a fully realized IBM Global Services project would cost should all its consultants apply this sized sledgehammer to each small task. Hopefully the article was not written up on the client's dime as well.
--
Java Hosting
How is this news? This is a random tutorial article from the net. Not newsworthy at all.
A Data Access Object should have 1) an interface (defining add, remove, retrieve, etc.) and 2) a standard implementation of the interface that reads/writes to the database on every method invocation.
A Decorator can implement the data access interface, delegating all method invocations to a wrapped instance of the standard implementation. Decorate the behavior of the standard impl. by providing a cache, checking the cache before retrieving a model and updating the cache before saving a model.
Because the standard impl. and the decorator share the same interface, you can have a factory create instances for you. Your code doesn't know or care which instance it is using. Mix and match Decorators to your heart's delight. A logging Decorator (track what data is being accessed, etc.) can be thrown into the mix, and again your calling code wouldn't be the wiser.
This pattern is easily unit tested and load tested. It doesn't require a running web container to test or run. It Just Works(tm).
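A minimal sketch of that DAO-plus-Decorator arrangement, assuming a made-up ListValueDao interface with only retrieve and save. The "database" implementation here is an in-memory map so the example runs without a container or a real database; none of these names come from the article.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical data-access interface for named lists of values.
interface ListValueDao {
    List<String> retrieve(String listName);
    void save(String listName, List<String> values);
}

// Stand-in for the standard implementation that would hit the database on every call.
class DatabaseListValueDao implements ListValueDao {
    private final Map<String, List<String>> table = new ConcurrentHashMap<>();
    int reads = 0; // simulated database reads (demo counter, not thread-safe)

    @Override
    public List<String> retrieve(String listName) {
        reads++;
        List<String> rows = table.get(listName);
        return rows == null ? new ArrayList<>() : new ArrayList<>(rows);
    }

    @Override
    public void save(String listName, List<String> values) {
        table.put(listName, new ArrayList<>(values));
    }
}

// Decorator: same interface, consults its cache before delegating.
class CachingListValueDao implements ListValueDao {
    private final ListValueDao delegate;
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    CachingListValueDao(ListValueDao delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<String> retrieve(String listName) {
        // Only calls the wrapped DAO when the list is not cached yet.
        return cache.computeIfAbsent(listName,
                name -> Collections.unmodifiableList(delegate.retrieve(name)));
    }

    @Override
    public void save(String listName, List<String> values) {
        // Write through to the wrapped DAO, then refresh the cached copy.
        delegate.save(listName, values);
        cache.put(listName, Collections.unmodifiableList(new ArrayList<>(values)));
    }
}
```

Calling code only ever sees ListValueDao, so a factory can hand back either the plain or the wrapped instance, and a unit test can count reads on the wrapped one.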
It's called caching, and it's been done since people had to load in commonly-used external references.
I've not RTFA, so perhaps it's truly excellent, but why the hell has this been posted? Anyone who's writing any sort of application and not making intelligent use of caching is either really junior, or should probably be looking for a new job.
It's official. Most of you are morons.
What? and give us all the details so we don't have to rtfa.
Like we need that stuff anyway to make a post!! neh eh!!! *pokes out tongue*
Sheesh. Sometimes I think IBM consultants still live in the 90s.
This is standard procedure and any good JDO implementation will do this for you.
http://en.wikipedia.org/wiki/Java_Data_Objects
Laugh while you can, people. Because Srinivasa Rao Karanam is training your replacement as we speak.
wow, ibm advert. Nice /. - how much are they paying you to advertise in your stories?
Or you could use something like Memcached. Works with pretty much any language, and tonnes less hassle. (Thanks LJ ;))
Last.fm - join the social music revolution
Seriously, if you're into servlets and you haven't been caching non-dynamic data, shame on you.
By the same token, if you're one of those twits who caches five years of data warehouse information in the application layer, there's a special place in programmer hell for you...
Memcached. I think you'll find that is actually the way it should be.
As the subject :>
...this "Lead Developer" at IBM will discover the wonderful performance benefits of pooling database connections, rather than open a new connection every single fricking time he hits the database. No wonder he saw a massive performance increase when he learned how to cache the lookup values.
And what's up with the "obj_" prefixes in some of the code listings? Newsflash: Java is an object-oriented programming language, and most competent Java programmers can figure out that a variable called resultSet probably isn't a primitive. You don't need to waste another 4 characters pointing out that fact.
Wow, that's such genius, it will revolutionise programming. Quick, patent it!
of course, i immediately made a base case, trapped the fowl and filled my belly (to put into practice such sound advice, you see ;-).
mmmm, cs stew...
Make a virtual hd? (memdrive or whatever it's called).
It seems that php docs state:
"Optionally you can use shared memory allocation (mm), developed by Ralf S. Engelschall, for session storage. You have to download mm and install it. This option is not available for Windows platforms. Note that the session storage module for mm does not guarantee that concurrent accesses to the same session are properly locked. It might be more appropriate to use a shared memory based filesystem (such as tmpfs on Solaris/Linux, or /dev/md on BSD) to store sessions in files, because they are properly locked."
Somewhere else it says:
"To use shared memory allocation (mm) for session storage configure PHP --with-mm[=DIR]."
You can probably google to get more info, but it seems that mm is a viable alternative.
Any web developer who's picked up a book on PHP or .NET web development would know how to do this in Apache or IIS.
Woah, you can cache objects in memory? You don't need to go to the remote database to get a list of states names? Holy crap!
A *terrible* language designed not by a computer mechanic but by ignorant math professors. Horrible. Worst part is that the new generation of programmers thinks they should do everything in SQL.
See LK discussion and linus' reply:
>
> Why not to use sql as backend instead of the tree of directories?
Because it sucks?
I can come up with millions of ways to slow things down on my own. Please
come up with ways to speed things up instead.
Linus
But, have you actually looked at the crap most open source projects turn out? (Certain ones are turning out incredibly high quality works of art; most, like slashcode, are crap.) I think this is actually a real hint to the majority of slashdot readers who are completely clueless.
What's the point of all these fancy new RAID arrays and such if we developers can't get increasingly sloppy with our coding practices?
Though I kid, I find that populating drop-down boxes for applications is pretty minor in the grand scheme of things. My performance issues tend to stem from doing cross-server database queries involving tables with millions of rows. That's when things get slow.
I'm sure there are some people who have to do this level of optimization - folks like Google and Amazon. The only place I've had to deal with this sort of thing is with things like our Intranet home page. In that case, all the drop-down boxes became JavaScript includes that were dynamically recreated every time the database changed. It went from an ASP-based page to completely static, taking considerable load off the server.
Well, the complexity of the problem made me think it would be VBS. This is the kind of example you can find in 'Advanced VBS' (or even Expert level). ...
Oh wait, you were talking about coding language
Sig (appended to the end of comments you post, 120 chars)
After bracing myself with some more coffee, I read a bit more of this article.
Bad, bad, bad.
What's with the Vectors, anyway? I haven't used those in years.
It was a joke! When you give me that look it was a joke.
Slashdot editing has been outsourced to India.
That was not for the run-of-the-mill programmers. That was for quiro in californian school.
The proposed solution is not rocket science, but as the examples suggest, there can be a fair amount of code to get there, which can sometimes be tedious to implement (especially when having many such lists).
One may want to look at automating this process using a product like the (open source) Lookup Loader library (http://www.esslibraries.com/ess/libraries/lookup / home.do or http://sourceforge.net/projects/esslibraries). Its purpose is to automate the (re)loading of such lists as much as possible. Configuration can be done through XML files. The library is extensible and not limited to databases and fully supports I18N data (something the article does not mention).
Wow, if you can publish an article about not loading the same data over and over again, then I've got some work to do!
How about an article on how to make an application run the same operation over and over instead of repeatedly executing it from a cron job?
Or an article about keeping your strings in a string table instead of dispersing them throughout your code?
Or perhaps I could write about how to use cut-and-paste to aid in producing Javadoc instead of typing @parameter over and over again?
Is this jackass for real? Publishing this garbage is almost as bad as filing for patents on trivial extensions of ideas. I hope the author at least realizes the only good to come from it is to pad his resume.
You are checking your backups, aren't you?
Vector obj_VectorOptions = new Vector();
What's with the mixing of Hungarian and long variable names?
I've been programming Java for 8 months and even I can tell this is a waste of space.
Lame, this is basic stuff. Give me a break.
The solutions provided in this article have very limited usefulness. I support the inclusion of a good web efficiency article on slashdot, but this article is a waste of time and its techniques are a waste of energy unless, for some reason, your site uses a ton of dropdowns.
I read another article on IBM once about web efficiency that was much more thorough and much more Slashdot-worthy, but I can't find it unfortunately. When I do, I will post it.
Could Jesus microwave a burrito so hot that he himself cou
...no wonder the site is so slow.
- Anonymous
Where did anyone present this as a "Revolutionary technique"? I don't see that as being said or implied anywhere. It's a how-to article for people who are new to Java, with a (nearly) complete end-to-end example. Should it have warranted an article in /.? Perhaps not, but your criticisms of IBM consulting services, based on a how-to article for new Java developers, seem to be quite a leap.
If the picklists are at all updateable while the application is running, he can cache as he does, but he'll need a mechanism to invalidate the cache and re-read from the database.
Forgetting that for a moment:
- Hungarian notation is NOT necessary in Java. Period. End of story.
- No one uses Vector anymore.
- There are some nice tag libraries, so STOP PUTTING JAVA CODE IN JSPS!
If you're a code monkey, memorize these two quotes:and
(quotes from http://www.vanderburg.org/Misc/Quotes/soft-quotes
For the last ten years or so, this has been my .sig, and this article is just a pretty bad example:
"Almost all programming can be viewed as an exercise in caching."
Yes, you should probably cache some or most of this stuff, but you should also consider which of these lists might contain dynamic data:
Is it enough to reload the list at application startup, every N days/hours/minutes, or would it be possible to send a signal from the updating process to trigger a reload only when needed?
Terje
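The two reload strategies raised here (refresh every N minutes, or reload only when the updating process signals a change) can be combined in one small wrapper. The sketch below is an illustration under assumptions: ReloadingList, its loader and the injected clock are invented names, and the loader stands in for whatever query fills the list.

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cached list that reloads when it is too old or has been invalidated.
class ReloadingList {
    private final Supplier<List<String>> loader; // stand-in for the database query
    private final long maxAgeMillis;             // "every N days/hours/minutes"
    private final LongSupplier clock;            // injected so tests can control time
    private List<String> values;
    private long loadedAt;

    ReloadingList(Supplier<List<String>> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    // Returns the cached values, reloading first if they are missing or stale.
    synchronized List<String> get() {
        long now = clock.getAsLong();
        if (values == null || now - loadedAt >= maxAgeMillis) {
            values = loader.get();
            loadedAt = now;
        }
        return values;
    }

    // The "signal from the updating process": forces a reload on the next get().
    synchronized void invalidate() {
        values = null;
    }
}
```

In production the clock would simply be System::currentTimeMillis, and invalidate() would be called from whatever code path updates the underlying table.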
"almost all programming can be viewed as an exercise in caching"
I really hope this is a joke...
/**
 * Returns the boolean value.
 * @return boolean
 * @param option java.lang.String
 */
public boolean isValidOption(String option) {
    option = (option != null) ? option.trim() : option;
    return optionsVector.contains(option);
}
The naming is horrible. If I'm using a variable I don't need to be reminded of its type. I certainly don't need to be reminded that my options vector is an object. I do need to know more about a method than its return type. And just to whine, I don't need to set the value of option to option if it is null, and I certainly don't need to check if my Vector contains it.
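One way the method could be tidied up along those lines, as a sketch only: OptionList is an invented class name, and a Set replaces the Vector on the assumption that the list never holds null and is only used for membership checks.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder for the valid values of one drop-down list.
class OptionList {
    private final Set<String> options;

    OptionList(String... options) {
        this.options = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(options)));
    }

    /**
     * Returns true if the given text, ignoring leading and trailing
     * whitespace, is one of the loaded list values. A null is never valid.
     */
    boolean isValidOption(String option) {
        return option != null && options.contains(option.trim());
    }
}
```

The null case short-circuits instead of being passed to contains(), and the Javadoc says what the method checks rather than restating its signature.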
How much wanking about could you possibly do to make a cache.
Thanks for the hot tip... I'm really anxious to see how the world changes now that the concept of the cache has finally been brought to light. Behold, sarcasm, I've invented sarcasm! Wow, maybe we should create a /. article for me?
The son of the PHB: My dad always told me to eat the last pickle before drinking the pickle juice.
Dilbert: Something something something!
The son of the PHB: Do you have any idea how much it hurts if you get a pickle in your eye?
Hahahahah! Well, I guess you had to be there.
Point is, the article should have been in developers, not the main page.
The Internet is full. Go Away!!!
Although saving data in memory is useful, note that files get cached by the OS. Session handling in PHP is using files, too, and is "just fast enough" (TM pending).
Come on. How can you call this concept even novel. Too much data to load from a slow process, so load it up front and cache it. I don't think you could call yourself a geek if you haven't bumbled into that one.
Come on guys, you can do better than this!
This is my sig.
I agree with other comments. This is pretty basic, but then again you can't believe some of the stuff people do SQL queries for. I started at one job, took over a web app that had a lot of forms with a "STATE" drop down. Every time they were going to a ref table to build the drop down. I gave them unbelievable crap.. "when's the last time we added a state to the U.S.?"
The point of stuff like this is, you can keep a dynamic page to generate the html and just make the damn thing an include. You can always regen the include if a reference table changes. Or build something fancier to check if data has changed and regen.
DO NOT DISTURB THE SE
The guy who wrote this just wanted to show a nice concise summary of different things you can do to cache static data on startup but still keep it in a DB.
So please direct all flames (A) to whoever posted this as "News", and (B) to whoever accepted this "story".
Or perhaps Hemos said to himself "Damn, a lot of sites sure are slow lately. Perhaps they know not of caching." and then posted the story.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
The number of database hits isn't the whole story, and is really not all that important. My web+database apps are LAPP driven, which provides some help with caching frequently used data (such as drop down lists).
Linux uses all otherwise unused memory for file caching. My database machines are loaded with 3-4GB of RAM each, so there is lots of memory used for caching.
PostgreSQL uses shared memory caches to increase performance (though I don't know the exact nature of its shared memory usage), and it works very well. Well constructed joins on multiple tables with millions of rows each, with reasonable filter criteria, return instantly.
My PHP applications populate (on average) 8 drop down lists per page, after determining access rights, user preferences, etc. via other queries.
When accessed from our LAN, page construction is instantaneous. Rather, page construction is limited mainly by the speed with which Firefox (and Konqueror) can render HTML. For all intents and purposes, it's instantaneous.
Many lists frequently change, so caching at the application level is probably not going to provide any benefits. Multiple database hits aren't necessarily problematic if the database and operating system's internal caching already does a good job.
Hate to see what happens when he is feeling creative
He needs a bigger moustache to hide behind.
* feel free to substitute VB here, as I've worked with young programmers who only learned VB in college for whom this approach is beyond their comprehension. Their normal solution is "More RAM".
this is getting old and so are you
blog
that's the only reason I've used Vector, aside from the minor synchronization distinction. Why doesn't ArrayList have setSize()? Anyone know this?
Hey, I'm just your average shit and piss factory.
You can do the same thing (caching frequently accessed data for dynamically generated pages) in PHP using the shared memory library. (SHMOP?) In regards to the sync issue one /.'r mentions below, you can poll for a TS that indicates the underlying SQL data has changed.
The reason you would maintain "static" data in the database is for data integrity.
For instance, the list of US states and Canadian provinces does not change very frequently. I think the last Canadian change was about a decade ago, the last US change nearly 50 years ago.
The "full name, abbreviation" name used in most pulldown lists (full name as label, abbreviation as value) can obviously be considered static.
So why keep it in a relational database? Simple - you can use it to provide referential integrity for all "state" fields in the rest of the database.
This isn't a huge deal with states, but it can be very important with domain level enumerations. Your form actions may be well-behaved, but a robust system must account for clowns who feed their own data directly into your action URL.
(As an aside, this isn't a theoretical problem. I've heard stories of people getting an order form for, oh, a laser printer. They capture it, change the price of the printer from $499.99 to $49.99 and submit the order. The action accepted it, and when the company attempted to refute it they lost because it was considered a bona fide negotiation since the web site could/should have been programmed to reject forms with altered prices. They made an offer, the client made a counteroffer, and the company accepted it. This depends on your state, etc. Given the current political climate I wouldn't be surprised to learn that this is now considered computer fraud with a 10 year prison sentence.)
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
... who reads Reduce the number of database hits
as 'increase the number of database misses'
rather than 'reduce the number of database accesses'?
Yes folks.. when early humans realized that they needed information quickly they would write it down. Thus removing the need for sometimes long response times searching the brain.
This is news?!?
This should be read: "Do not use." Unsafe concurrent access is a great way to introduce critical bugs that are only found after the app is deployed to production servers. (Just ask any ColdFusion developer who has used sessions in CF 5.0 or earlier.)
I wonder if Slashdot does caching of frequently posted stories.
when presenting an HTML table, with a big drop-down in each row, is there some way to send that list to the client only once instead of numOfRows times?
who is she? leave a comment!
"How to Use Semicolons To Avoid Compiler Errors"
Rule 1: Never do anything you don't need to. (run time optimization)
Rule 2: Never do anything twice. (caching)
Rule 3: Empty cache that won't be required. (space optimization)
If you design properly, your software will run fast, stay fast and small.
Remember "Dr. Dobb's Journal of Computer Calisthenics and Orthodontia (Running Light Without Overbite)"?
It's just as true now as it was back when you were writing for machines with K (not M) of RAM.
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
then you're a really crappy web app developer
Well much programming can be viewed as an exercise in data compression. You take lots of requirements and constraints with lots of redundancies and overlaps and compress it to code.
A smart programmer can do very good compression. A wise programmer will do it so that you can add likely stuff later and recompress without taking too much time (perhaps at the expense of lower compression).
Of course if it's bad programming it's an exercise in data loss, corruption or explosion ;).
I have to wonder if the real purpose of this article's presence on Slashdot is to demonstrate how pitifully primitive the average programmer in the commercial world is.
If you're still in school, you may not be aware of this, but a huge percentage of people who program for money are no more competent than the article's author. Many are just discovering these basic (no pun intended) programming techniques for the first time.
This tells us two things:
1. You can get work even if you barely know a compiler from WINMINE.EXE.
2. If you choose to become a truly competent programmer, you will kick ass in the commercial world. The need for such people is great.
The Internet is full. Go away.
Wow. Hoodathunk you could actually store data in an application's memory space?
Next article: How to persist cached data by writing it back to the database.
Underscores for instance variables? Never heard of that one, and I've just checked The Elements of Java Style and can't find any reference to it.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
I hope that stories like this and the one about the "Juicer" that isn't, don't become common... lame. What's going on? Is there just not anything interesting coming in today, so they decided to fill some space with some boring stories?
This is just typical of your large corporate sweatshops. Think "Accenture", think "Avanade" (partly owned by Accenture), think "IBM GSA". Managers get pitched the premium sizzle for the IT steak and get lumped with a shed-load of graduate programmers who think the old farts, who take too long to do something, really don't know "the new way" of doing things.
I think this problem is a DATABASE problem. If the query is common enough, try caching it in memory on the database server. Something like the mysql query cache is your friend, not hundreds of lines of code!
Get rid of everything Micro and Soft: Buy Viagra and/or Linux
The article title sounds as if the author has figured out some amazingly cool trickery to avoid database hits, in reality the author is making a very specific monolithic mediocre serverside cache (not even a mediocre distributed cache or a clever highly performant local cache). The title of this article probably should have been "Populating HashTables using bad JDBC".
How did this get posted on Slashdot? How did this get posted on IBM? How did this get published at all? Why, oh why, for the love of all that is right and good, does this guy still have a job?
Ugh!
With Smarty, and PHP, you could put the list in its own cached template. Check to see if the template is cached before displaying it and if not go to the database.
You can even use Smarty's html_options function to build your lists for you.
And when something changes the data for the list, just clear the cache. The next time it's needed the cache will be updated.
We run an off-the-shelf application that handles this problem nicely. Their database layer has a query object that takes a "cache timeout" value for each query issued.
This layer replaces all whitespace in the query with single spaces, then hashes the SQL query, and stores the results in the filesystem (as XML or plain text, depending on the nature of the result set(s)). The hash becomes the filename (e.g. "704A94757A794ABB8DEFCB48E86B97B1-cache.tmp"). Every time the same query is issued again, the cached file is used instead of a database connection, until the timeout age of the result set has passed. After the timeout has elapsed, the query is re-issued to the database, and the results are re-cached.
Now, this application has everything stored in the DB - the position of controls on forms, application logic, permissions, etc. This new caching system they recently added drastically reduced the load on our DB servers. By 50% or more. But all the flexibility of having DB-defined forms and workflow is still present.
This app is built in C# on .NET, and I'm not sure if these kewl caching features are built into ADO.NET or not. This is the only .NET app I've ever seen built with this sort of "universal" DB caching. But I think it's a very cool way to make an efficient but still entirely DB-driven app.
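The scheme described (normalize whitespace, hash the SQL, reuse the stored result until a per-query timeout passes) can be sketched in a few lines. This is an illustration, not the vendor's implementation: QueryCache and its method names are invented, results are kept in memory rather than in files, MD5 is an assumption chosen because it yields a 32-character hex key like the example filename, and the current time is passed in to keep the example deterministic.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical query-level cache keyed by a hash of the normalized SQL text.
class QueryCache {
    private static final class Entry {
        final String result;
        final long storedAt;

        Entry(String result, long storedAt) {
            this.result = result;
            this.storedAt = storedAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> database; // stand-in for the real query call

    QueryCache(Function<String, String> database) {
        this.database = database;
    }

    // Collapses runs of whitespace to single spaces, then hashes to 32 hex characters.
    static String key(String sql) {
        String normalized = sql.trim().replaceAll("\\s+", " ");
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(normalized.getBytes(StandardCharsets.UTF_8));
            return String.format("%032X", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the cached result unless it is missing or older than timeoutMillis.
    String query(String sql, long timeoutMillis, long nowMillis) {
        String key = key(sql);
        Entry entry = entries.get(key);
        if (entry == null || nowMillis - entry.storedAt >= timeoutMillis) {
            entry = new Entry(database.apply(sql), nowMillis);
            entries.put(key, entry);
        }
        return entry.result;
    }
}
```

Two queries that differ only in spacing or line breaks share one cache entry, which is the point of normalizing before hashing.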
...that can be automated.
For instance, on some sites that I've developed, whenever an admin page is used to add to menus, it *does* update the DB, but it also recreates a static .php file that has the menus as .php code, and that file is included whenever the menu is used. No database access for pages that need the menus, no restarts necessary whenever the semi-static data is modified. Pretty much a win-win for all involved.
Or did I miss the point of the article?
The more likely situation is that the Java fanatics got carried away and generated every page dynamically, whether it needed to be or not, and as a result, the server is choking.
While this is obvious to anyone who bothers to actually care about design... it isn't so obvious to everyone. Perhaps I should explain.
For the past year or so I have had to come into a team and do hardcore J2EE development (not a problem, I enjoy it). But the older developers are hopelessly clueless and out of date. (Fortunately there are younger, saner heads... but age is a red herring.) They are under the impression that because they are using an Object Oriented (really object aspect, I believe) language, they are doing OOP.
This attitude basically results in things like this: JSPs that contain thousands of lines of Java, connect directly to the database (no pooling and right in with the HTML), cut and paste SQL code (the same code copied everywhere), and they all post and repost to themselves to do processing. They don't create any actual useful objects (say an Abstract Data Type) to encapsulate data. They think design patterns were just cooked up last year.
They are so far from understanding OO that when I mentioned something on Booch's books (Object Oriented Design Using Examples) where he states that:
"a concept qualifies as an abstraction only if it can be described, understood, and analyzed independently of the mechanism that will be used to realize it"
(I was trying to explain why a Servlet was an object, but that just using it did not make something OO, even though they tell management that they are all OOP experts.) They look at me like I am crazy and one of them even said that he thought that Booch didn't know anything about OO (OK, a lot of them are gonzo over the cancer that is PowerBuilder and those types usually have no clue about design or anything).
They are under the impression that everything I know about Software Engineering I learned in school (it is very hard to explain to them that I had a more theoretical education, including grad school). So when I got to work away from the old guys doing an app the right way (Entity beans, SessionFacade, tiering, MVC, etc.) they got mad and refused to use Java anymore (and I work at a very big company).
What is the point of this rant (and believe me there is much more, these guys are in the stone age)? Well, articles like this one may seem blatantly obvious to /.ers, but people who never bothered to learn anything in their twenty years of development want you to hold their hands through everything. Age really is a red herring (I know a lot of developers and many of them are much older than me and introduced me to some of this stuff... age has nothing to do with it).
This is a stupid article to put on /. but it exists for those old guys who never bothered to pick up a trade rag in the last twenty years (these guys spend most of their time ogling press releases from vendors).
I recently wrote a cool search caching thing for a web application, to avoid re-allocating memory for the search results every time. When I was done, I looked to see how much memory I was saving: a cool 10k per request. Wow.
Given the potential for bugs in the code, I should probably trash the whole thing and replace it with a simpler implementation, but having spent 2 hours writing it I haven't the heart.
The moral of this story is what we all already know: avoid premature optimisation. The bottlenecks in your application aren't going to be where you're expecting anyway, so just choose the simplest implementation that could possibly work. If it turns out to be too slow or inflexible later, then fix it later.
fish and pipes