Tuesday, July 01, 2008

Searchable Flash - some early tips

Adobe announced this morning (I am in Zurich) that it has worked with Google and Yahoo! to improve the search engines' ability to reach inside the Flash (SWF) file format. You can read the FAQs here.

So how does it work?

This is being done via a headless Flash reader that can extract strings from a SWF and use them to build a string array and an initial page ranking for each page. The string arrays are then fed into the Google and Yahoo! ranking algorithms, which give the page an initial ranking for each tree. Exactly what is indexed, and what rank the search engines will give it, remains to be seen. Don't automatically assume everything in a SWF is now on par with HTML text.

For example, a large vector-based drawing of the word "Washington" is unlikely to make you rank high, as it is still not text. Cognitive capabilities are not easy to bestow upon machines (which is why CAPTCHA works so well).



Why you might be excited


Without doing anything, the engines will use some of your text values in SWFs and possibly adjust your rankings for certain terms. This is a good thing for people who have not yet learned how to get Google and Yahoo! to use static content for their indexes. In cases where your unique text content has previously been inaccessible, there should be improvements.

Why you might not be so excited

Initial page ranking is not the silver bullet it once was. As soon as searches are performed and your content comes up, both Google and Yahoo! will still dynamically adjust your score based on a multitude of conditions. I have outlined several of these tips in this blog post already. Duane's World Episode 3 also has some tips and tricks on how to use Google's dynamic page ranking algorithm to move up the ladder.


Some tips and tricks:


Flash developers who took care have had their content indexed all along. Using XML or XHTML data providers with strings in them, plus a link from the index.html shell, was a great way to get content indexed and still generate good page rankings. From your main SWF, you simply used the XHTML file as a data provider and parsed it with E4X (ECMAScript for XML, ActionScript 3's native XML support). That way the same raw data was both indexed by search engines and used by the application. Developers who took this care might now be trumped by those who simply get lucky and end up with high rankings.
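To make the shared data file idea concrete, here is a minimal sketch of the kind of XHTML file involved and the strings a reader can pull out of it. It is written in Python purely for illustration; the approach described above used E4X inside the SWF. The sample markup and the `extract_strings` helper are hypothetical, and this says nothing about how Google or Yahoo! actually process such a file.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML data file: linked from index.html so crawlers can
# reach it, and loaded by the SWF as its data provider.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example Store</title></head>
  <body>
    <h1>Mountain bike clothing</h1>
    <p>Jerseys, shorts and gloves.</p>
  </body>
</html>"""


def extract_strings(xhtml_text):
    """Return the non-empty leading text of each element, in document order.

    Text that follows a child element (the "tail" in ElementTree terms)
    is ignored to keep the sketch short.
    """
    root = ET.fromstring(xhtml_text)
    strings = []
    for element in root.iter():
        text = (element.text or "").strip()
        if text:
            strings.append(text)
    return strings


if __name__ == "__main__":
    for line in extract_strings(SAMPLE_XHTML):
        print(line)
```

The point is that there is a single source of truth: the strings the application displays are the same ones sitting in a plain, crawlable file.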

Flash developers, like HTML developers, still need to understand that the pages that point at your content have a lot to do with your dynamic rank. Content is still king too. Make sure you have your keyword well researched before betting the farm on it. For example, how many people performed unique searches on that term? Is it an unambiguous term?

This will provide more relevant automatic search rankings of the millions of RIAs and other dynamic content that run in Adobe Flash Player. Moving forward, RIA developers and rich Web content producers won’t need to amend existing and future content to make it searchable — they can now be confident it can be found by users around the globe.

One of the best ways to achieve a higher ranking is to examine other Flash sites and see what they have done. Sombrio is a good example. If you search for this in Google, the Sombrio Clothing Company comes up #1 out of about 2 million.

http://www.google.com/search?source=ig&hl=en&rlz=&=&q=sombrio&btnG=Google+Search


Note that if you read this, then click on Sombrio Cartel, you are actually helping maintain it at the #1 place. Google will see the above string as a search in true REST style, then track what you click on, bounce rates etc.
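To see what the search URL above actually carries, the sketch below splits it into its query parameters using Python's standard library. The `query_params` helper is a hypothetical name, and what Google does with each parameter on its side is not something this snippet can show.

```python
from urllib.parse import parse_qs, urlsplit

SEARCH_URL = (
    "http://www.google.com/search"
    "?source=ig&hl=en&rlz=&=&q=sombrio&btnG=Google+Search"
)


def query_params(url):
    """Map each query parameter name to its first value.

    Parameters with blank values (such as "rlz=" above) are dropped,
    which is the default behaviour of parse_qs.
    """
    parsed = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in parsed.items()}


if __name__ == "__main__":
    for name, value in query_params(SEARCH_URL).items():
        print(name, "=", value)
```

The `q` parameter holds the search term itself, so the whole search is described by the URL.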

Peter Elst also noted some additional concerns here. He wrote:

The concern I have here is that URL requests to the backend will get indexed, those URLs getting exposed in search queries or spider bots hitting those URLs could cause issues. Its not like in HTML content where the search engines can ignore form submit URLs, there is no such context in a HTTPService or URLRequest.


True. Once again, developers owe it to their clients to learn more about how the systems work. Note that Google currently does index some content and you can try this search by clicking here:

http://www.google.com/search?hl=en&lr=&as_qdr=all&q=sombrio+filetype%3Aswf&btnG=Search


More on this later. I am going to try some experiments to see what is possible.

