Efficient HTML Organization and Distribution on Webservers?
rasjani asks: "I recently started working as a sysadmin. The first major thing I want to rearrange is page distribution. Currently we are using rsync over ssh to copy the stuff to the production servers, with no backlog of any kind. I would like to implement CVS (or the like) into this scheme, so that if the NOC notices that something is broken on the web, they can do a rollback from CVS if the webmasters or editors aren't around to fix the problem. So people, do you have any thoughts on how to implement this? Has anyone done something similar and is willing to share their experiences? What gotchas might I stumble upon? Should I still use rsync/ssh for file distribution and add the check-in to a few scripts, or should I just make a cron job on the production servers to poll CVS for updated material?"
I am not sure exactly how it works, but I know for a fact that www.gnome.org and news.gnome.org are done from CVS.
Why not email the webmasters and get all the help you need?
"The poet presents his thoughts festively, on the carriage of rhythm; usually because they could not walk" Nietzsche
I am in a similar position at the moment. If you receive any information on this directly from the Gnome webmasters, please post it back here; I would be very interested.
To set up CVS over SSH, use the following environment variables:
and set up CVS to use RSA Authentication (/etc/ssh/sshd_config)
plus similar for SSH2. This requires that each user and each host have its public SSH key on the CVS server, in their home directory and
Finally, there's some work on implementing SSL/TLS directly into the CVS server, to eliminate the need to provide local user accounts on the server. This should dramatically increase the security of the repositories since it allows them to be turned into closed systems without user shell access. In the most likely scenario, CVS will be able to function much like SSH - you can operate in anonymous mode, or you can require PKI authentication of either or both parties.
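The environment variables and sshd settings referred to at the start of this comment did not survive in the text above. As a rough sketch only, and not a reconstruction of the poster's settings: a CVS client is conventionally pointed at SSH by setting CVS_RSH to ssh and using an :ext: CVSROOT. The user, host, repository path, module and tag names below are made-up placeholders.

```python
import os


def cvs_over_ssh_env(user, host, repo_path, base_env=None):
    """Return a copy of the environment set up for CVS over SSH.

    user, host and repo_path are placeholders for this sketch; they are
    not values taken from the comment above.
    """
    env = dict(os.environ if base_env is None else base_env)
    # CVS_RSH names the remote-shell program the cvs client runs for :ext: access.
    env["CVS_RSH"] = "ssh"
    # :ext: selects the external (rsh-style) access method.
    env["CVSROOT"] = f":ext:{user}@{host}:{repo_path}"
    return env


def checkout_command(module, tag=None):
    """Build (but do not run) a cvs checkout command for a module."""
    cmd = ["cvs", "-q", "checkout"]
    if tag:
        # -r checks out a specific tag, which is what a rollback would use.
        cmd += ["-r", tag]
    cmd.append(module)
    return cmd
```

To actually run it, something like subprocess.run(checkout_command("htdocs"), env=cvs_over_ssh_env("webadmin", "cvs.example.com", "/var/cvsroot"), check=True) would do, assuming key-based SSH logins are already working for that user.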
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
We thought about doing something like this, but found that CVS was overkill for our purposes. We didn't get past the stage where we determined that we didn't actually NEED every old version of our files, just something backed up so we could fall back if we needed to.
So we set up a staging server to which the developers have access, and only the sysadmins have access to the functions to move over files to production. This allows us to keep development separate from production, in addition to providing two sets of backups (one prod, one dev).
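A staging-to-production push like the one described above could be scripted roughly as follows. This is only a sketch: the paths and host name are invented, and it assumes the push is done with rsync over ssh (as in the original question) rather than whatever tool this poster actually uses. The function only builds the command, so a sysadmin can inspect it or do a dry run first.

```python
def rsync_push_command(staging_dir, prod_host, prod_dir,
                       backup_dir=None, dry_run=False):
    """Build an rsync-over-ssh command that mirrors staging to production.

    All arguments are placeholders chosen for this example.
    """
    # -a preserves permissions and times, -z compresses, --delete removes
    # files on production that no longer exist on staging.
    cmd = ["rsync", "-az", "--delete", "-e", "ssh"]
    if backup_dir:
        # Keep the files that get replaced or deleted, as a cheap fallback.
        cmd += ["--backup", f"--backup-dir={backup_dir}"]
    if dry_run:
        cmd.append("--dry-run")
    # The trailing slash on the source means "copy the directory contents".
    cmd += [staging_dir.rstrip("/") + "/", f"{prod_host}:{prod_dir}"]
    return cmd
```

Running it with dry_run=True first shows what would change on production without touching anything.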
The next stage of this is to set up a box with lots of HD space, so we can keep "hot" backups of the html back through the days. We figured out that the backups for the various web servers would come to something like 9GB/wk. The boxes we eventually went with are attached to NAS boxes with 240GB each (mirrored, so functionally 120GB), which lets us keep roughly 3 months of backups (120GB at 9GB/wk is about 13 weeks) on a hard drive, as opposed to tape.
Note that you could easily set up a *nix box with 300GB of space relatively cheaply. The processor and memory requirements are going to be practically nil, so we figured that it would be about $1200 for the box. That price is for a beige box, an option my boss threw out since we couldn't get HW support for it, so our system cost significantly more.
When we need to restore older files, we can just load up the old tar.gz, copy the files as needed, and we're outta there. No worries about tape drive screwups, and so on.
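The dated tar.gz approach above might look something like this in script form. It is a minimal sketch: the archive naming scheme and directory layout are assumptions made for the example, not this poster's actual setup. The last function just redoes the retention arithmetic from the comment (120GB usable at about 9GB per week).

```python
import shutil
import tarfile
from datetime import date
from pathlib import Path


def make_backup(docroot, backup_dir, day=None):
    """Write a dated tar.gz of docroot into backup_dir and return its path."""
    docroot = Path(docroot)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    day = day or date.today()
    # The "htdocs-YYYY-MM-DD.tar.gz" name is a made-up convention.
    archive = backup_dir / f"htdocs-{day.isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(docroot.rglob("*")):
            if path.is_file():
                # Store paths relative to the docroot, e.g. "docs/a.html".
                tar.add(path, arcname=path.relative_to(docroot).as_posix())
    return archive


def restore_file(archive, member, dest_root):
    """Copy one file out of an old archive back into dest_root."""
    dest = Path(dest_root) / member
    dest.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        src = tar.extractfile(member)  # raises KeyError if the member is missing
        if src is None:
            raise ValueError(f"{member} is not a regular file in {archive}")
        with src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    return dest


def weeks_retained(capacity_gb, weekly_gb):
    """Whole weeks of backups that fit in the given capacity."""
    return capacity_gb // weekly_gb
```

With the numbers from the comment, weeks_retained(120, 9) gives 13 whole weeks, which matches the "roughly 3 months" estimate.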
Hope this helps.
ceci n'est pas un sig.
Check here: Using CVS to manage a website
Is that you?
Doesn't this belong on slashcode.com?
Jesus was all right but his disciples were thick and ordinary. -John Lennon
...a content management system (CMS).
I work in the CMS group at a large tech company. (Key word: large.) We use ATG and Documentum to form ours, but there are many others depending on your needs... Interwoven, CVS, etc.
Here are the major features you should have in this system:
This may sound like major overkill, but trust me, it's not. Put it this way: if you implement a solution using CVS (command-line tools) and rsync, you've just created a barrier to entry for publishing on your site. You want the marketing people to be able to push their cute little Flash/PPT/PDF presentations out NOW without having to log into a command-line system, and you want those same marketing people to do that without having to know anything besides Flash/PPT/PDF. You want publishing on your site to be easy and straightforward so that you, the sysadmin, can focus on the backend stuff without having to deal with marketing whining that they can't seem to get their new PDF on the site.
Spend the extra money and go with a content management system from the companies that do this for a living, and then you can rest easy and do the things you really want to do in your job while letting the website content manage itself.
Use RPMs. Store your doctree in CVS and have a script/makefile that will do exports and create an RPM from it. Do the same with code so that static content and dynamic code both get proper, versioned treatment. This makes it very easy to revert to known good configurations, etc. The downside is that you have to have root access on the web server. I can speak from practice in saying this works quite well, especially if you have staging/QA environments to test on. Simply roll the RPM, test it, and if it passes, push the same RPM live.
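The export-then-package flow described above could be driven by a small script along these lines. This is a sketch under assumptions: the module name, tag, spec file and the site_version/site_release macro names are all hypothetical, and the spec file itself (which would have to reference those macros) is not shown. The functions only build the command lines; nothing is executed.

```python
def export_command(module, tag, dest_dir):
    """cvs export of a tagged tree; export leaves out the CVS/ admin directories."""
    # cvs export requires an explicit tag (-r) or date (-D).
    return ["cvs", "export", "-r", tag, "-d", dest_dir, module]


def rpmbuild_command(spec_file, version, release="1"):
    """Build a binary RPM, passing the version in as macros.

    site_version and site_release are made-up macro names that a
    hypothetical spec file would read as %{site_version} etc.
    """
    return [
        "rpmbuild", "-bb",
        "--define", f"site_version {version}",
        "--define", f"site_release {release}",
        spec_file,
    ]


def install_command(rpm_path, rollback=False):
    """Install or upgrade the package; needs root on the web server."""
    cmd = ["rpm", "-Uvh"]
    if rollback:
        # --oldpackage lets rpm replace a newer package with an older one.
        cmd.append("--oldpackage")
    cmd.append(rpm_path)
    return cmd
```

Reverting to a known good configuration is then just install_command() on the previous RPM with rollback=True.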
We have multiple instances (dev & prod) running on the same server using a Network Appliance through NFS. Two nice things about this are:
1) We can also mount the netapp on Windows.
2) It automatically takes disk snapshots, which are very easy to access (just cd .snapshot/hourly.0 or something). It keeps a few hourly, daily, weekly, etc.
I'm not trying to be a salesman, but we love our netapp.
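Pulling a file back out of a snapshot, as described in this comment, can be as simple as a copy. Here is a minimal sketch; it assumes the snapshot tree is visible as .snapshot/hourly.0 under the mounted docroot (the poster only says "or something", so the exact directory name may differ on a given filer), and the function names are invented for the example.

```python
import shutil
from pathlib import Path


def snapshot_path(live_root, relative_file, snapshot="hourly.0"):
    """Where a file would live inside a given snapshot (assumed layout)."""
    return Path(live_root) / ".snapshot" / snapshot / relative_file


def restore_from_snapshot(live_root, relative_file, snapshot="hourly.0"):
    """Copy one file from a snapshot back over the live copy."""
    src = snapshot_path(live_root, relative_file, snapshot)
    if not src.is_file():
        raise FileNotFoundError(src)
    dest = Path(live_root) / relative_file
    dest.parent.mkdir(parents=True, exist_ok=True)
    # copy2 also carries over the modification time of the snapshot copy.
    shutil.copy2(src, dest)
    return dest
```

Since the snapshots are read-only, the worst case of a wrong restore is that you copy the file again from a different snapshot.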