What's in Your HTML Toolbox?
Milo_Mindbender asks: "I've just ended up in charge of cleaning up an old and rather large website created by some non technical people. It has all the usual problems: paragraph tags with no ending tag; mixed case file names that work on Windows but not on a Linux webserver; files with mixed Windows/Linux/Mac line endings; duplicates or partial duplicates of files created when working on pages; and the list goes on. I'm wondering what tools you guys keep in your HTML/website toolboxes that work good for cleaning up this sort of mess. Things like pretty-printers, HTML 'lint' programs, dead file detectors, batch renamers (that change links and the files they point to into OS neutral names), and 'diff' programs that ignore HTML whitespace. I'm particularly interested in batch processing tools that actually fix problems (not just report them) because I've got a lot of files to deal with and don't have the time to edit every one by hand. So what's in YOUR toolbox?"
CAPITAL ONE!
[...]
Wait, what was the question again?
Javascript + Nintendo DSi = DSiCade
Dreamweaver FTW! It would be a huge timesaver in this situation.
Good luck!
Censorship is obscene. Patriotism is bigotry. Faith is a vice. Slashdot 2.0 sucks.
I know many of the geeks out there have forsaken Perl, but it is still, in my opinion, an indispensable tool. I am currently fixing up a website similar to the one you described, especially in terms of the HTML problems. Write a Perl script to fix capitalization, closing of tags, etc. But understand that if code is not written well to begin with, then in many cases it is impossible to automate the process of fixing it. You are going to have to do some things by hand.
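For what it's worth, the capitalization part of that job can be sketched in a few lines. This is a rough Python sketch rather than Perl, and it is not anybody's actual script: the names rewrite_links, lowercase_tree and fix_site are made up for illustration, and it assumes ASCII file names, quoted href/src attributes, and that you are working on a backup copy.

```python
import os
import re
import string
from pathlib import Path

# ASCII-only lowercasing, so bytes outside ASCII are never touched.
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# href="..." or src="..." with either quote style (unquoted values are skipped).
LINK_RE = re.compile(r"""(\b(?:href|src)\s*=\s*)(["'])(.*?)\2""",
                     re.IGNORECASE | re.DOTALL)

# Anything with a scheme (http:, mailto:), a //host, or a bare #fragment.
EXTERNAL_RE = re.compile(r"(?:[a-z][a-z0-9+.-]*:|//|#)", re.IGNORECASE)


def rewrite_links(html):
    """Lowercase the path part of local href/src values in an HTML string."""
    def fix(match):
        attr, quote, url = match.groups()
        if EXTERNAL_RE.match(url):
            return match.group(0)
        # Keep the query string and fragment exactly as written.
        cut = min((i for i in (url.find("?"), url.find("#")) if i != -1),
                  default=len(url))
        return attr + quote + url[:cut].translate(ASCII_LOWER) + url[cut:] + quote
    return LINK_RE.sub(fix, html)


def lowercase_tree(root):
    """Rename every file and directory under root to its lowercase name."""
    # Bottom-up, so a directory's contents are renamed before the directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            lower = name.translate(ASCII_LOWER)
            if lower == name:
                continue
            src, dst = Path(dirpath, name), Path(dirpath, lower)
            # Refuse to clobber a different file (Foo.html next to foo.html).
            if dst.exists() and not src.samefile(dst):
                raise FileExistsError(f"{dst} already exists")
            src.rename(dst)


def fix_site(root):
    """Lowercase all names under root, then fix the links that point at them."""
    lowercase_tree(root)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in (".html", ".htm"):
            # latin-1 round-trips every byte, so encoding and line endings survive.
            text = path.read_bytes().decode("latin-1")
            path.write_bytes(rewrite_links(text).encode("latin-1"))
```

Run fix_site("path/to/copy") on a copy of the site. Name clashes stop the run with an error instead of silently overwriting a file, which is exactly the case you want to look at by hand.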
Depending on how bad it is, consider rewriting the HTML and CSS part of the website from scratch. It may be easier than fixing old code.
There are two approaches: live with it and make as few changes as possible, or bite the bullet and do a complete rebuild. To do a cleanup, check out tidy - it does a good analysis of the existing pages and can generate CSS that is OK, but not beautiful. If you want the final pages to look the same, but be standards compliant, see meyerweb.com and read his books on rebuilding pages ("Eric Meyer On CSS" and "More Eric Meyer on CSS"). Pragmatic is his keyword: lots of examples and he makes sense.
Good luck. You're going to need it.
Been there, try this
I know it's a huge mickey mouse and there's probably (scratch that-- definitely) better ways, but when I need to do repetitive, but relatively simple, tasks that can be done via command line, I use JavaScript to automatically create all the commands, copy them into a batch file, and done.
I use PHP. Server side includes are perfect for standard headers/footers. I check server variables to change behavior based on whether it's on the dev server or the final webserver.
I'd paste an example, but slashdot seems to think PHP code is "junk characters".
HTMLKit has a lot of great options for developers, and a good plugin system.
"Better to be vulgar than non-existent" -Bev Henson
> :argdo %s/^M//ge | w would remove the funky Windows line endings from every file in the argument list (mind you, ^M is typed as Ctrl-V Ctrl-M in vim).
:-)
Or, in emacs
M-% (AKA Meta(usually Alt)-Shift-5)
Query Replace: ^M with [nothing]
P.S. Note that ^M is not Caret-M. It is a single character. I usually just copy it out of the file, and then do it in emacs.
My toolbox has a little white pill that I take every time I get a hankering to work with HTML. It fixes me up right quick.
I got my Linux laptop at System76.
The disaster that was "s.gif" (or "trans.gif" in some circles) used as a layout tool was horribly over-used - and the 'net is a worse place because of it. In most projects now, I seek to replace all instances with a "compatible" approach.
I create a class:
.spacer{
line-height:0;
font-size:0;
}
Then I replace all those hundreds (and sometimes THOUSANDS) of references to s.gif with the following:
I use a span sometimes, as required - if the DIVs alone cause layout issues.
Say hello to faster web pages instantly!
How many escape pods are there? "NONE,SIR!" You counted them? "TWICE, SIR!"
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Oops Sorry!
<div class="spacer" style="width:Xpx; height:Ypx;"></div>
How many escape pods are there? "NONE,SIR!" You counted them? "TWICE, SIR!"
Vim, grep, and sed. I heard they make movies, too! :-)
"All you have to do is be fragile and grateful. So stay the underdog." Chuck Palahniuk, Choke
I've used Dreamweaver pretty successfully to clean up a lot of poor HTML since it has pretty good functionality. I don't really have any suggestions as far as other tools go but for general single page cleanup I like DW. I've cleaned up quite a few huge documents that someone just saved as a webpage out of Word and ended up with 2 MB of HTML. Not really sure if that would work for your batch processing needs but if you have excessive issues with single pages I would recommend it.
Really, the only way to do a cleanup of your typical dog's breakfast collection of html is
1. Tidy the pages (using htmltidy)
2. Use a custom written script in whatever language (perl is good) to do as much of the task as possible automatically (things like replacing static headers with includes) - you'll need to be good with regex
3. Open the pages manually, and finish the job - I like Dreamweaver for this particularly if it's a complicated table based layout
Whatever the case, it's going to take you a lot of time and energy; there is no quick fix.
NZ Electronics Enthusiasts: Check out my Trade Me Listings
Firefox with the IE Tab (or IE View), Web Developer, View Formatted Source, and HTML Validator extensions.
Vim for the editing, Emacs for the web server, interpreted language, games, database, web browser to check it with, source code management, image editor, vector graphics editor, e-mail client, e-mail server, ...
'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
No, really, stop laughing.
Frontpage, once you convince it to stop the WYSIWTG crap, has three tools that will make fixing a non-technical user's webpage easy. (Never, ever, let a non-technical user use Frontpage without supervision. It's worse than Word.)
I'd be shocked if there aren't better tools out there -- but by and large either they don't do as much, or they cost a significant chunk of change.
(Hey, you, with the laughing -- point me to an app that can do #1 with compatible replacements for #2 and #3, and, er, you'll get good karma for being so mean and laughing.)
TextWrangler or BBEdit Lite, vi, telnet, ftp, Photoshop CS (not CS2), GraphicConverter, Firefox, Safari.
Leave it as a giant tangled mess and secure your job for the next 3 years. When they threaten to lay you off, tell them you need at least 1 more year of work before you can straighten up the code and 'hand off' the job to the new webmaster.
Tidy is great, as others mentioned. If you feel confident, it will even let you cherry-pick the data you want to scavenge with XSLT.
Separating grain from chaff
A static HTML project has numerous index2.old.html, index2.html, index_2.html, project2.html.old and so on - files that you just aren't sure are useful?
Copy the project directory (touch all the files) and do a wget -r on the tree; by looking at the access time, you'll know all internal referenced files. Alternatively, scan the webserver logfiles to know which files are useful.
Be sure your filesystem is configured to register access times if you pick the first method...
(As a bonus, a close peek on the 404s might give you some answers on mis-used capitalization of filenames.)
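If you would rather not depend on access times, roughly the same answer can be had by following the links yourself. A minimal sketch in Python, assuming a purely static site whose pages are all reachable from index.html; find_unreferenced and LinkCollector are names invented here, and links built by JavaScript or referenced only from CSS will not be seen:

```python
import os
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of href, src and background attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "background") and value:
                self.links.append(value)


def find_unreferenced(root, entry="index.html"):
    """Return files under root that cannot be reached by links from entry."""
    root = Path(root).resolve()
    reached = set()
    queue = [root / entry]
    while queue:
        page = queue.pop()
        if page.is_dir():
            # A link to a directory is served as that directory's index page.
            page = page / entry
        if page in reached or not page.is_file():
            continue
        reached.add(page)
        if page.suffix.lower() not in (".html", ".htm"):
            continue  # images and the like are reached but not parsed
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="latin-1"))
        for link in collector.links:
            try:
                parts = urlsplit(link)
            except ValueError:
                continue  # malformed URL, nothing to follow
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link, mailto:, or a bare #fragment
            path = unquote(parts.path)
            base = root if path.startswith("/") else page.parent
            target = Path(os.path.normpath(base / path.lstrip("/")))
            if target == root or root in target.parents:
                queue.append(target)
    return sorted(p for p in root.rglob("*") if p.is_file() and p not in reached)
```

On a case-sensitive filesystem a link to Page.html will not reach page.html, so wrongly capitalized links show up here as "unreferenced" files too, the same way they would 404 on the Linux server.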
Lynx / Links / ELinks
Can be used to dump the text data of old and unmaintainable HTML documents; most useful when trying to scavenge only the text contents to put in a database or so.
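If no text browser is handy, a rough equivalent can be scripted. This is only a sketch using Python's standard html.parser, not what Lynx does internally; html_to_text and the choice of which tags start a new line are my own assumptions:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keep the text of a page, dropping markup, scripts and styles."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "br", "div", "li", "tr", "table", "ul", "ol", "hr",
             "h1", "h2", "h3", "h4", "h5", "h6", "blockquote", "pre", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self.skip_depth:
                self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace and drop the empty lines left by markup.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

The parser is forgiving about unclosed paragraph tags and the like, which is the whole point with pages in this state.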
First, before bitching about something, you should take a moment to learn about it.
"It has all the usual problems: paragraph tags with no ending tag"
There's no end tag required for paragraphs, as per the official spec: http://www.w3.org/TR/REC-html40/index/elements.html
HTML is not XML. Closing tags are optional for some elements, and forbidden for several others. And putting a slash at the end of a tag that doesn't have a closing tag, so it looks "xml-y", is an affectation and a waste of bytes.
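To make that concrete, here is a small Python sketch that flags the two habits in question. The two sets are copied from the HTML 4.01 index of elements linked above; the class and function names (EndTagAudit, audit_end_tags) are invented for this example:

```python
from html.parser import HTMLParser

# Elements whose end tag is forbidden in HTML 4.01 (they are always empty).
FORBIDDEN_END = {"area", "base", "basefont", "br", "col", "frame", "hr",
                 "img", "input", "isindex", "link", "meta", "param"}

# Elements whose end tag is optional in HTML 4.01.
OPTIONAL_END = {"body", "colgroup", "dd", "dt", "head", "html", "li", "option",
                "p", "tbody", "td", "tfoot", "th", "thead", "tr"}


class EndTagAudit(HTMLParser):
    """Record end tags and XML-style slashes on elements that must stay empty."""

    def __init__(self):
        super().__init__()
        self.forbidden_closed = []  # (tag, line) for things like </br>
        self.xml_style = []         # (tag, line) for things like <hr />

    def handle_endtag(self, tag):
        if tag in FORBIDDEN_END:
            self.forbidden_closed.append((tag, self.getpos()[0]))

    def handle_startendtag(self, tag, attrs):
        # Overridden so that <br /> is reported separately from </br>.
        if tag in FORBIDDEN_END:
            self.xml_style.append((tag, self.getpos()[0]))


def audit_end_tags(html):
    """Parse html and return the audit object with both lists filled in."""
    audit = EndTagAudit()
    audit.feed(html)
    audit.close()
    return audit
```

A missing end tag on anything in OPTIONAL_END is not an error in HTML 4.01, so this sketch deliberately does not report it.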
Sera
Slashdot, where armchair scientists get shouted down and armchair theologians get modded up.
Previous posts have mentioned Perl and PHP; seconding those for high-intensity search-and-destroy missions. As for software, you can't go wrong with TextPad, WinSCP, and PuTTY.
For best practices (separation of content from structure from behavior, mostly), keep an eye on what is published in and around A List Apart and the Web Standards Project. And if you're looking for several sets of outstanding presentation and behavior tools, check out the YUIBlog and the Yahoo! Developer Network. (Hint: their page grid layout, font normalization, and CSS reset libraries are an excellent place to start.)
Fromdos, grep, sed and awk. Possibly some normal pretty printer too.
Tidy, as others have already mentioned, will be your very best new friend.
Install the 'Web Developer' extension for Firefox, and use some of the HTML/CSS validators in the Tools submenu.
Get a good handle on regex searching & replacing (if you're doing this from Windows, I suggest Funduc's "Search & Replace").
If you're migrating your GIFs to PNG (which I would recommend), then you need to get yourself pngout, to compress them to their smallest possible size (Photoshop SUCKS at this).
And as someone else said, make an empty new standards compliant template, and get to cutting and pasting; it can be a *brutal* initial process, but you'll probably save yourself time in the long run, depending on how clean you want to eventually get the code. If you just want it to be standards compliant, then you can just do a clean up job. If you want to do it 'right,' you'll want to develop a new template and coding style to properly integrate the HTML and CSS. Things like not putting everything in a DIV (a sure sign you're a newbie to CSS), just to style something. Figure out why you should be using H1, H2 tags (& TBODY & TH tags if you're using tables for outer layout), etc, without having to use a lot of unnecessary DIVs all over the place. Inline styles = bad.
Figure out why XHTML may not be the best choice over HTML. Know which DTDs to specify. Know the difference in IE6 between standards mode and quirks mode, and which DTD to use to make IE6 behave. Know that IE7's quirks mode is supposedly identical to IE6's; you supposedly won't get the new 'more-standards compliancy' in IE7 without a DTD.
Oh yeah - the guy who posted about replacing spacer gifs with 'spacer DIVs'? Don't do that to yourself, okay? Yikes.
Learn about usability and readability. Learn about typography, and how light-on-black text should be sized differently from black-on-light. Thinking about grey text on black or grey text on white? Don't be stupid. Make the stuff readable! Learn that sans serif fonts are more easily read at screen density (opposite of print). Learn why Verdana is usually not your friend (go for Trebuchet MS or even Arial).
Oh, and learn to indent your freaking HTML!
Some nice resources:
Activating the Right Layout Mode Using the Doctype Declaration
Quirksmode - a GREAT resource. Awesome info here. Memorize it.
When I clicked the page I was so sure I would see at least one "What's in My HTML Toolbox? vi!" comment, modded Funny of course, but no...
Maybe I should check again later...
You just got troll'd!
CSSEdit by Macrabbit.
Awesome program and worth checking out if you use a Mac.
I like big butts and I cannot lie.
If you're using all static HTML, you can get rid of dead pages with wget. Do "wget www.website.com/whatever -r" to download it, and then just use what you've downloaded as your base.
To find broken links, I like to use Xenu. Google it.
http://ablegray.com
My HTML toolbox, which is my little 64Mb thumbdrive, has really only the bare essentials for website development: Notepad++ and WS_FTP.
It is now.
This is worse than image spacer, please go die in a fire
"The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
A hammer for hitting myself over the head, and a bottle of whiskey to numb the pain... of dealing with HTML.
Game... blouses.
I've used OpenSP a lot. It's a suite of tools that includes onsgmls, the parser that lies at the heart of the W3 validator. Combined with find you can easily validate local copies of all the files. It's faster than using the validator for multiple pages. It also includes osgmlnorm, which is used to normalize SGML. If you have a load of "XHTML without closing p tags" type HTML, change the doctype to an HTML doctype, run it through osgmlnorm, switch the doctype back, and all the closing ps are there. (It's not quite that simple though - you have to clean up lots of spurious > s which get introduced for sensible but obscure SGML reasons, usually after img elements. It's trivial to do the cleanup automatically.)
No it's not. HTML is still SGML, and still alive and well.
"The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
The great thing about web standards is... there's so many of them!
Changing old code to new code can rarely be automated. It's not a simple syntax change, it's a paradigm shift, and computers are not yet smart enough to figure out the semantics of old code and rewrite it into an HTML/CSS combo.
HTML Tidy is something free and available which will do the very basic work of cleaning up and fixing the HTML where possible.
Try Notepad++. Syntax highlighting for HTML, PHP and JS, conversion between Windows/Unix line endings, macros, hex editor, HTML tidy-ier-upper, and more. Lots o' nifty stuff and it's OSS.
Place a curse on Microsoft
Or you could just use the padding / margin features provided by CSS.
margin-top: 1px;
margin-right: 2px;
margin-bottom: 3px;
margin-left: 4px;
or margin: 1px 2px 3px 4px;
padding-top: 1px;
padding-right: 2px;
padding-bottom: 3px;
padding-left: 4px;
or padding: 1px 2px 3px 4px;
q<register> to record a macro, q to finish recording. Execute the macro with @<register>, then you can execute it again with @@. Obviously the @ commands can be prefixed with a number to repeat them that many times, 5@@ would repeat the last macro 5 times, for example.
Game! - Where the stick is mightier than the sword!
I've heard that these discussions can be dangerous... ;)
My main complaint about emacs (I tried it for about a month) was the key structure. I didn't like holding down Ctrl whenever I want to do something - I prefer vim's modal command system. I could see how it could annoy some people, however.
I honestly haven't found the need for particularly sophisticated macros while I'm editing. The . (repeat last command) and ! (pipe) keys have always been enough for what I need.
I'm still learning vim, but I like what I've seen so far.
Although you're right, flamewar in 5..4..3..2..1.. *ducks*
I use Adobe Golive for this, and it's served me well. It detects errors like broken links, and offers batch fixing.
Failing that, perl is probably your best bet.
foo mane padme hum
Err.. this approach just doesn't work. Images are inline elements, you can't replace them with an equivalently sized block element and expect the page layout to be the same. And setting the CSS 'width' attribute of an inline element doesn't work in Explorer, so the entire approach is flawed. Sorry.
As much as WYSIWYG editors some times suck, Dreamweaver is alright. I like that it helps with the organization but also lets me get as geeky as I'd like.
Pretty Pictures!
jEdit (www.jedit.org) - best editor in existence, unmatched functionality
Dreamweaver 8 (on OS X) - DW is an outdated way to do things, but it is still very powerful
Quanta (Quanta Gold for Win or OS X -> http://www.thekompany.com/products/quanta/; Quanta Plus for Linux -> http://quanta.kdewebdev.org/)
PHPEclipse (has annoyances but very good PHP tools)
For a redo of that old site of yours I recommend simply installing a CMS and migrating the content by hand if necessary. That's probably faster and more effective than anything else. Static HTML just isn't the way to go these days, which eliminates most of the need for a large-type HTML editor. Check out Joomla! (www.joomla.org)
We suffer more in our imagination than in reality. - Seneca
I wrote many webmaster scripts to deal with all kind of problems I run into while building and maintaining my sites. And here is a script that many webmasters may find particularly useful, it reduces the size of html files: htmloptim
How is this better than an image spacer? Elements have padding and margin properties, use them!
I will no doubt get replies that "Scripting Language X would be better", but I have the most experience with Perl. So if time was of the essence, that's what I'd use. Perl is a Swiss Army Knife in this kind of situation, and you can easily get just about any kind of blade or tool you might want to deal with files and formatting via CPAN.
You can use Perl to fix the file names, restructure the directories, extract the content, put it into a database, and even drive the new site if you'd like. No matter what the choice of new site software, Perl can salvage the existing content and transform it into whatever format you require.
If I had more time I might choose Ruby instead simply because I like programming in it more. However the choice of ready-made tools via the Ruby CPAN equivalent is somewhat less.
No matter what scripting language you choose, you'll be saving time in the long run. Building tools is always time well spent. Indeed, taking a few hours or even days to write a script that makes a weeks-to-months long job of reformatting take hours is one of the great joys of programming for a living.
Post Scriptum: I'm sure you already did, but just in case: Don't forget to back up the original. Thrice. They'll tell you it's already backed up. That's fine. Make three of your own anyway. If they'll let you, lock one in the safe. "Whenever testing or reconfiguring, always mount a scratch monkey."
NEdit, Firefox & Opera
Politics is Treachery, Religion is Brainwashing
On all of my Windows machines, I keep a copy of CSamp running in the systray at all times. It's a tiny little app that will grab the RGB/Hex values for any pixel on the screen. Great for matching colors in images, or if, like me, you are too lazy to view source and go digging for a color attribute.
Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
I honestly haven't found the need for particularly sophisticated macros while I'm editing.
This is why editor-vs-editor arguments get so silly sometimes. People often fail to realise that requirements differ.
Clever use of emacs keyboard macros (and presumably vim too, I wouldn't know) makes a huge difference to a lot of the common tasks I perform. For example one common task is to take an API document and turn it into a class file (missing only the code in the method bodies). There was a time when I used to do that kind of thing manually. That can easily take half an hour. I know some people who'd knock up a quick script to do it (Perl/Python/some shell) - that takes about half as long. Or you can do the whole job with keyboard macros inside five minutes.
My biggest web devel tool is Firefox, with the Web Developer extension and the HTML Validator extension. The former does all sorts of amazingly neat things like letting me get precise info about any element within a page (using "Display Element Information" under the "Information" menu, CTRL+SHIFT+F for short), showing me the HTTP response headers to any given page, add custom styles to a page, validate links, check for Section 508 accessibility compliance, resize the window for simulating lower screen resolutions, and on and on and on!
The latter does instantaneous HTML validation using Tidy and displays any errors or warnings on the "view source" page. It also gives me LINE NUMBERS in the view source window, which is a blessing. The beta version (which I prefer) lets you pick between the Tidy algorithm and the W3C's SGML parser. The SGML parser version gives the same errors as the W3C's own online validator, but without any need to submit the page through an online form.
As for editing HTML, I generally use SciTE or one of its derivatives (eg Notepad2). Sadly, those aren't available under Mac OS X, so when I need to work on a Mac box I use Smultron. THAT, however, is just an editor. People get religious about their editors, so my advice is just to pick one that suits you and ignore anybody what sniggers at you.
Programming smart indent:
/") /p or /table
Initialisation:
$XHTML_COMPATIBILITY = 1
# set to 0 if you don't want XHTML like "<br />" (simply "<br>" instead)
define TagEnd {
start = search("= 0) && (search_string(tag, ">",0) == -1)) {
# it really looks like an HTML tag: " (otherwise, comparison operators in PHP are hard to type)
newtag = replace_in_string(tag, "^\\= 0) {
# If this is a tag without content (like
or
if ($XHTML_COMPATIBILITY && (length(newtag) > 0 )) {
# if we want XHTML compatibility AND there really is a tag
replace_selection(tag "
# insert the XHTML end-of-tag
gotoPos = $1 + 2
# as something was inserted, the cursor needs to be moved
} else {
# no XHTML compatibility or no tag
gotoPos = $1
# essentially: do nothing
}
select(gotoPos, gotoPos)
set_cursor_pos(gotoPos)
# reset the selection and put the cursor to where the user expects it
} else {
# a normal tag with a content
replace_selection(tag "")
# insert closing tag - the matched tag (e.g. p or table) ends with
select($1, $1)
set_cursor_pos($1)
# reset the selection and put the cursor to where the user expects it
}
} else {
# it's not an HTML tag - leave everything alone
select($1,$1)
set_cursor_pos($1)
}
}
Newline:
return -1
Type-in:
if ($2 == ">") {
TagEnd($1, ">")
}
It's not mine; I didn't write it, but I find it fantastic for working with HTML. Wish I could credit the writer.
http://michaelsmith.id.au
What for do you got him in your toolbox?
:)
To fix the holes in your code?
--- I am known for the ones who want to find me on the net. Is that a privacy risk or a privilege? One might wonder..
There is a small utility called dos2unix which changes MS-style line endings in text files to Unix style. /usr/bin/mac2unix is symlinked to dos2unix on my Gentoo box, so I guess it can fix Mac OS line endings too.
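If dos2unix isn't installed, the same cleanup is only a few lines of script. A minimal Python sketch, not a reimplementation of dos2unix: normalize_newlines and convert_tree are names I made up, and the list of file extensions is an assumption you should adjust so that images and other binaries are never touched.

```python
from pathlib import Path


def normalize_newlines(data: bytes) -> bytes:
    """Turn Windows (CRLF) and old Mac (CR) line endings into Unix LF."""
    # CRLF first, so a Windows ending does not become two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_tree(root, suffixes=(".html", ".htm", ".css", ".js", ".txt")):
    """Rewrite text files under root in place; return the files that changed."""
    changed = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            original = path.read_bytes()
            fixed = normalize_newlines(original)
            if fixed != original:
                path.write_bytes(fixed)
                changed.append(path)
    return changed
```

Working on bytes means the script never has to guess each file's character encoding, and files with mixed endings come out consistent.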
# cat
Damn, my RAM is full of llamas.
You probably should have put this in the "Steve Irwin is Dead" section rather than the "What's in Your HTML Toolbox" section.
Setting his threshold to 5, Sparky eliminated most of the trolls on /.
Dreamweaver FTW!
What kind of advantage would using dreamweaver give you in a situation like this?
I first started with HTML/websites in the mid 90s with AOLPress, then Adobe Pagemill, NetObjects FUSION, GoLive Cyberstudio (which was bought by adobe and turned into GoLive), and eventually, I dropped all of these studio apps in favour of vim using PERL and eventually moved on to PHP.
I've since started using this great app called TextMate, and when I get a complete site that I need to work on, I pipe the code through a handful of PERL programs I wrote to make it readable and make sure all tags are properly closed, then open it in TextMate to start working.
I haven't used any of those big apps (GoLive et al) since the late 90s, so they may have improved since then, but aside from their WYSIWYG aspect and their built-in validators, what other advantages do they give you? How do those apps aid you when you've got embedded code or PHP or whatever? Do they have built-in interpreters?
I dunno, I've just found that you really need to have a full webserver to properly work on a site. I wonder when Adobe is going to embed apache/php/perl/mysql/etc into GoLive/Dreamweaver to get a proper environment for the previews.
And, I dunno if you can answer this, but how well does Dreamweaver handle Ruby on Rails? I can't imagine it supporting rhtml (erb) or yaml code.
...spike
Ewwwwww, coconut...
I prefer to use vi (of the elvis variety) unless I'm editing a page some a$$hole has used dreamweaver or frontpage to create. I can't stand "^M"! If I'm doing some heavy php work then I use Bluefish.
Having to work for a living is the root of all evil.
http://validator.w3.org/
http://jigsaw.w3.org/css-validator/
Along with awk, sed, vi/pico/nano, and occasionally perl for really complex alterations.
So no, XHTML != HTML.
Perl, especially Template Toolkit, with Emacs takes care of most things.
When I posted, the "Steve Irwin is Dead" story didn't exist. Check the timestamps, my post was posted 1 minute before the Irwin story.
I had to do some extraction of text from HTML, so I wrote a program for it. It may or may not be useful in this case, and it doesn't always do 100% of the job (but I've found the 98% it does do to be very useful). It is (OF COURSE!) open source so you are free to tinker and improve it for your own use. Download from:
http://jsoftco.8m.com/download.html
Teen Angel - a Ghost Story
For batch changing I have found Advanced Find and Replace to be very effective. I recently had to bring a non-standards-compliant site that didn't use CSS up to standards compliance with CSS. The site had about 15000 pages at the time, if I remember rightly, but it was quite painless updating it with Advanced Find and Replace.
For HTML, CSS and PHP editing I use TextPad. A great text editor with syntax highlighting and other tools that make writing code easy. For checking the page I use Firefox with Web Developer plugin, Opera (my main browser) and, grudgingly, IE.
I would....
Bring all the files into Dreamweaver as a 'site'; then, as I changed the filenames, DW would (I think) automatically update all links to those files.
DW will also report which files are used by no pages in the site, and which pages are not linked to by any pages in the site.
Write some sort of AppleScript that would open all the files in BBEdit and change their line endings (should be simple I think, but I've never done it myself).
For fixing the broken html, just run it through one of the many applications based on HTML Tidy. I'm sure there's something automated out there.
If I need to repeat a macro over a number of lines, I record the macro with qq, visually select the lines and hit @. No need to count the number of lines to do the old 12@ way.
Put this in your .vimrc:
:vnoremap @ :normal @q<CR>
load "linux",8,1
One would assume, seeing as it's 2006 and all, that he intends to rebuild the site as a modern standards compliant site. Even if he chose html 4.01 instead of xhtml it's still best practice to close all your tags.
Keep the old version around to review with... then rebuild the whole thing in a CMS.
- Set up your stylesheet to cover all the examples in the old version... just click through the old site and pick out consistent examples of html entities... don't forget to scope your entities by providing IDs around such areas as menus, masthead, sidebars, advertising, etc.
- Ignore anything that is similar enough to look almost the same, no one will complain if you resolve inconsistencies... but will if you make unilateral decisions like 'All lists should look the same'
- Add in any custom classes... for when 'All lists just aren't the same'
- Hire an assistant with no web experience to copy/paste all the plain text scraped from a browser view of the page into a vanilla Dreamweaver generated html page and save it using the page title as filename.... no links, no formatting... just text. Takes 10 minutes to instruct on this one, then they go do it for a day or two.
- Instruct said assistant to go back and use the WYSIWYG viewer to add paragraphs and select lists and convert them to html lists. Takes 10 minutes to instruct, another day to complete.
- Instruct assistant to go back and add h1, h2, etc where needed.
- You can see where I'm going. Delegate the job in easy to do, hard to mess up, bite-sized tasks.
- While they are doing this you can be finishing up the more complicated pages and adding in stuff like form validation, unobtrusive dom based javascript to replace the horrible Dreamweaver scripting that's inevitably in there... and swapping script based mouseovers for CSS based ones... etc. and setting up all the chunks of html that need to be handled more delicately for accessibility.
- When pages are complete... just copy/paste the final html into the CMS according to your layout requirements for content regions
Essentially I'm saying that instead of using Tidy or something like that, which will require you to go back and double check that its automation went well... use a human equivalent, which if constrained to simple tasks will do a much better job.
The nice thing that you get as a bonus is an assistant who knows enough html to be useful but not so much as to be dangerous... and that's hard to come by without paying for a full fledged developer. If that person wants to learn more, great... you can teach him/her the right way and won't have to unlearn them of bad habits. In the meanwhile you can teach them how to make maintenance updates to text via the CMS using FCK or TinyMCE as a WYSIWYG... very easy for making text changes.
A fool throws a stone into a well and a thousand sages can not remove it.
This has got to qualify as the WTF of the Day:
"One would assume, seeing as it's 2006 and all, that he intends to rebuild the site as a modern standards compliant site. Even if he chose html 4.01 instead of xhtml it's still best practice to close all your tags."
HTML 4.01 IS the current HTML (as opposed to XHTML) standard. And some of those bullshit "best practices", like "closing all tags", are forbidden by that very standard. It's not like it's hard to read. I linked to the specific page on the W3C site.
So stop being a Microsoft Weenie (yes - you're easily identified by your willingness to break standards, just as FrontPage breaks those same standards by doing "best practice" shit like closing tags that don't need them).
List of tags that the standard forbids having a closing tag: http://www.w3.org/TR/REC-html40/index/elements.html
Do you close your image tags? Then you're not in compliance with the published standard. So please spare the bullshit about "modern standards compliant site. Even if he chose html 4.01 instead of xhtml it's still best practice to close all your tags.". You don't know what you're talking about, and it shows.
In case you missed it, the article's title asked what was in your HTML toolbox, not your XML toolbox, or XHTML toolbox.
I use Subversion (locally) for my web sites, with hooks to automatically upload changed files on commit.
It saves a ton of time, and probably bandwidth since I can work on a local copy and only upload the changes. (Without having to keep track of which files I've changed.)
So if I were in your shoes, I'd make a local copy, start ripping stuff out little by little until the site doesn't work, rollback the changes until it does, and repeat. Having versioning really helps out if you make a mistake, which is inevitable.
If moderation could change anything, it would be illegal.
But not putting in closing paragraph tags just invites sloppy coding. Always close your tags. Makes it far easier to read, for both yourself and others who may have to read the code, and means you can jump to XHTML without any hassle.
Which you can do now without too much worry.
What a lot of hot air.
It is highly unlikely the poster you are replying to was saying you should close all elements, even those that do not take a closing tag, as that is madness.
What they were probably suggesting is that elements that can be closed *should* be closed, which is entirely sensible. It makes HTML far easier to parse for a human, assuming you have reasonably sensible code layout.
vim has a macroing and scripting system. The scripting's no LISP, but for some of us, that's a blessing.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Try FileZilla or Novell's NetDrive. Makes an FTP server look like a drive letter in Windows; any app can use FTP for read and write.
and it means there's less work for the future when he or somebody has to update the site to xhtml or whatever comes in the future that demands properly nested and closed tags.
Mass processing of text files including moving and renaming files and following links?
Sounds like a job for some custom Perl code.
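The same idea can be sketched in Python rather than Perl. This is only a rough illustration under some assumptions: "OS neutral" is taken to mean lowercase with hyphens instead of spaces, only quoted href/src attributes are handled, and the names `neutral_name` and `rewrite_links` are invented for the example.

```python
import re

# Matches href="..." or src="..." in double or single quotes.
# Unquoted attribute values are not handled in this sketch.
LINK_RE = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def neutral_name(name):
    """Lowercase a file name and replace spaces with hyphens."""
    return name.lower().replace(" ", "-")


def rewrite_links(html):
    """Apply neutral_name to every local href/src value in an HTML string."""
    def fix(match):
        prefix, quote, target = match.groups()
        # Leave absolute URLs (scheme: or //) and bare #anchors alone.
        if re.match(r"[a-z][a-z0-9+.-]*:|//|#", target, re.IGNORECASE):
            return match.group(0)
        path, sep, fragment = target.partition("#")
        return f"{prefix}{quote}{neutral_name(path)}{sep}{fragment}{quote}"
    return LINK_RE.sub(fix, html)


print(rewrite_links('<a HREF="About Us.HTM#Top">x</a> <img src="http://Example.com/A.gif">'))
```

The files themselves would need to be renamed with the same `neutral_name` rule (for example via `os.rename`) so the links and their targets stay in step. Query strings are lowercased along with the path here, which may or may not be what you want.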
There is no "-1 offended" or "-1 you don't agree with me" mod options for a reason.
What are your criteria for excluding XHTML from the set of valid HTML specifications?
As is XML, therefore XHTML.
The article is about HTML standards. The standard is clear. Closing tags are optional in some cases, forbidden in others. We have enough problems with certain companies breaking standards - we don't need to advocate it here. Sloppy coding is the result of stupidities like breaking standards because you want the code to "look xml-y".
Anyone who can't read HTML 4.0 shouldn't be writing HTML 4.0. It's as simple as that. Don't claim to want to follow the standard, but want to break it because you can't be bothered to learn it.
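To make the "forbidden in others" part concrete, here is a small sketch that flags end tags HTML 4.01 does not allow. The element list is the set declared EMPTY in the HTML 4.01 DTD; the class and function names are made up for the example, and it uses Python's standard `html.parser`, which is a lenient tokenizer and not a validator.

```python
from html.parser import HTMLParser

# Elements whose end tag is forbidden in HTML 4.01 (declared EMPTY in the DTD).
FORBIDDEN_END_TAG = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}


class EndTagChecker(HTMLParser):
    """Collect (line, tag) pairs for end tags that HTML 4.01 forbids."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_startendtag(self, tag, attrs):
        # XHTML-style <br /> is not reported here; only explicit </br> forms are.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # The parser hands over tag names already lowercased.
        if tag in FORBIDDEN_END_TAG:
            self.problems.append((self.getpos()[0], tag))


def forbidden_end_tags(html):
    """Return the list of (line, tag) pairs found in an HTML string."""
    checker = EndTagChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


print(forbidden_end_tags('<p>one<br></br>\n<p>two<img src="a.gif"></img>'))
```

Note that the unclosed paragraph tags in the sample are not reported, since their end tags are optional in HTML 4.01.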
Here's what the fucktard said in black and white (yes, it's Tuesday):
"Even if he chose html 4.01 instead of xhtml it's still best practice to close all your tags."
Not "close all the tags the standard allows for". All tags, in direct violation of the standard.
When's the last time you saw an image tag pair with enclosed content? A break tag pair? Paragraph pairs aren't even needed since logically, the start of one paragraph ends the previous one.
If someone can't read HTML, they shouldn't be writing HTML.
Here's the link for XHTML: http://www.w3.org/TR/xhtml1/
And here's how it's described:
Extensible HTML is NOT HTML 4 any more than SGML is HTML. XHTML is a superset of HTML. For example, "All dogs are animals, but not all animals are dogs."
Hmm, but that would rely on the macro operating linewise, no? What happens if your macro operates on the next 7 lines and creates 2 of its own?
Game! - Where the stick is mightier than the sword!
I'd say that XHTML is a subset of HTML. All of the markup in XHTML can be expressed in valid HTML4, but the reverse is not true.
XHTML1
<p>...</p>
<img ... />
<div class="...">...</div>
HTML4
<p>...
<img ... >
<div class=...>...</div>
The XHTML elements are, of course, case sensitive.
XHTML is a grammatically strict dialect of HTML4 with HTML4 compatible semantics. XHTML grammar rules are HTML4 compatible as well as XML compatible.
Things do get a little more snakey when you start using MathML embedded in the document.
* Perhaps to be clear, I should say that when I refer to XHTML, I mean XHTML 1.0. I don't think XHTML 1.1 has gathered enough steam for me to bother with. XHTML 1.1 breaks backwards compatibility in many ways.
** You fell into a fallacy. e.g. dogs are animals, cats are animals, no dogs are cats.
Unfortunately, the topic is HTML, not XML or XHTML, which are both totally irrelevant to the discussion. Additionally, what you or I say doesn't matter - the spec and the W3C, which maintains it, are the final arbiters of what is HTML, and they say that XHTML != HTML.
What next - argue that SGML is a subset of HTML? Forget it.
Wow... You I bet you really like the smell of your own farts.
JOhn
Campaign for Liberty
Dammit.. I should have hit preview... I'm sure grandparent understands what I meant though. His / Her superior intellect could probably even translate this entire thread into eight different languages.
Re: IE CSS -- That's IE's fault, not CSS's
What about a Table Cell with a width parameter?
http://www.jedit.org/
I love it - all the features listed above, plus a plugin framework that supports sftp, ftp, and a zillion other tools. (Only complaint is no seamless svn checkin/checkout)
Java - so it runs exactly the same on Windows, Mac, Linux - makes platform transitions quite smooth.
Not dinging any other tools - just saying that this is an essential tool in my (expanded) toolkit.
But Herr Heisenberg, how does the electron know when I'm looking?
Re: IE CSS -- That's IE's fault, not CSS's
Yes, but those of us doing real web site development work understand that our results have to work with IE, because a lot of people use it and we have to support anything that a significant number of people use. You can't just blindly code to the standards and hope; you have to pick and choose only the standards that actually work for most people.
Table cell widths work fine, in some situations.
Well, you could try a combination of SmartFTP and Notepad++
:)
http://www.smartftp.com/
http://notepad-plus.sourceforge.net/
SmartFTP allows you to edit files live. It ftps the selected file down, you edit it in your favourite editor (automatically launched from SmartFTP, of course), SmartFTP automatically detects that the file has changed and ftps it back up again. The overall effect is that you can hit save in Notepad++ then refresh the webpage to see the changes. A very convenient way to break a live website.
biopowered.co.uk - catalytically cracking triglycerides for home automotive use since 2008. Just say no to big oil!