Adobe Makes Flash Crawlable
nickull wrote in his journal that "Today Adobe Systems made an announcement that it has provided technology and information to Google and Yahoo! to help the two search engine rivals index the Shockwave Flash (SWF) file format. According to the company, this will provide more relevant search rankings of the millions of pieces of Flash content. Until now, developers had to implement workarounds, such as using XHTML data providers, for exposing text content used in Flash to search-engine spiders and other bots. While the Flash content is exposed, it is not yet clear how it will be utilized by the search engines, as they have not revealed their algorithms. The SWF specification is openly published."
Amazing what a little competition will bring...
I'd be much happier if the search engines quit linking to Flash-only websites completely. Then maybe those horrible things would go away.
I can't think of any case where I've seen a Flash-only site where Flash added anything of substance (cuteness doesn't count), and they tend to be hard and non-standard to navigate, break key bindings (e.g. CTRL-T to open a new tab doesn't work if the mouse is over the Flash), etc.
Here is an example: a business association's website was redesigned in Flash. Instead of their staff page having a simple list of photos, names, job titles and phone numbers that you could search by hitting CTRL-F, the Flash version just shows a photo of all of the staff members, and you can only find the job titles and contact info by holding the mouse over the appropriate person's photo. So, if you want to find the contact info for the newsletter producer and you don't already know what he/she looks like, you have to move your mouse over each of 15 different photos until you find the right one. Stupid. There is just too much dumb stuff going on with Flash.
For a start, "crawlable" does not mean it WILL be crawled. More likely, most Flash will contain nothing but junk and internals that were never meant to be seen anyway. I wonder when the first "we recovered a password that was stored inside a Flash file" / "we googled for vulnerable Flash apps and found these" hits will come about. And, as someone's already pointed out, even if you *can* extract the text from them, you can't do much useful with it beyond saying "it's in this Flash somewhere". You can't even do "find in page" once you've clicked on such a link. And if it's at the end of an hour-long Flash animation, you're not going to sit through it.
Then you'll have some people who have actually used bitmaps instead of text inside the Flash for various reasons, etc. The only useful thing to come out of this may well be a "View as HTML" version of Flash-only pages. But they will still be second-class pages because the designer didn't want to do it themselves.
Given that Flash isn't exactly the most popular choice in the world (e.g. if you want your content to appear in Google, be read by people, be bookmarked, be quoted/cited/linked, etc.), this won't affect much. Finding content in a Flash file is like looking for a needle in a haystack, and that's the problem solved by this announcement. However, finding *useful* content in that file is going to be even harder, and actually getting users TO that data will be almost impossible.
I imagine that the same thing will happen as happened with images, PDFs, etc. Those who design their Flash well will get something indexed and it'll actually get a hit or two from "View HTML Version" on Google. Those who don't (i.e. 99% of the people who make them) won't see any difference at all.