Generally, search engines are text-based. Most search engines can only index text content; however, Googlebot’s capabilities have been enhanced over time.

Google can index a plethora of file types. The most common formats include:

  • Shockwave Flash (.swf)
  • Microsoft Excel (.xls)
  • Microsoft PowerPoint (.ppt)
  • Microsoft Word (.doc)
  • Adobe Portable Document Format (.pdf)
  • Adobe PostScript (.ps)
  • Atom and RSS feeds (.atom, .rss)
  • Autodesk Design Web Format (.dwf)
  • Google Earth (.kml, .kmz)
  • MacWrite (.mw)
  • Microsoft Works (.wks, .wps, .wdb)
  • Microsoft Write (.wri)
  • Open Document Format (.odt)
  • Rich Text Format (.rtf)
  • Text (.ans, .txt)
  • Wireless Markup Language (.wml, .wap)
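If you want to audit which files on your own site fall into these formats, the list above can be turned into a simple lookup. The following is only a minimal sketch: the function name `is_listed_file_type` and the constant `LISTED_EXTENSIONS` are made up for illustration, and the set simply mirrors the extensions listed in this post rather than any official, up-to-date list from Google.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the list in this post (an assumption: this is
# not an official or complete list of what Google indexes).
LISTED_EXTENSIONS = {
    ".swf", ".xls", ".ppt", ".doc", ".pdf", ".ps", ".atom", ".rss",
    ".dwf", ".kml", ".kmz", ".mw", ".wks", ".wps", ".wdb", ".wri",
    ".odt", ".rtf", ".ans", ".txt", ".wml", ".wap",
}


def is_listed_file_type(url: str) -> bool:
    """Return True if the URL's path ends in one of the listed extensions."""
    # Look only at the path, so query strings and fragments are ignored.
    path = urlparse(url).path
    # Compare case-insensitively, since "Report.PDF" and "report.pdf"
    # are the same format.
    suffix = PurePosixPath(path).suffix.lower()
    return suffix in LISTED_EXTENSIONS


if __name__ == "__main__":
    for candidate in (
        "http://example.com/files/Report.PDF?download=1",
        "http://example.com/intro.swf",
        "http://example.com/logo.png",
    ):
        print(candidate, is_listed_file_type(candidate))
```

A check like this only tells you that a file's extension appears in the list; whether a particular file actually gets indexed still depends on Google being able to crawl it.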

Millions of websites use Adobe’s Flash technology in one form or another, and back in the day Google’s guidelines explicitly advised that any Flash content without alternate textual content, e.g. Scalable Inman Flash Replacement (sIFR), would not be indexed. Things have changed since then: in 2008 Adobe announced that it had provided Google and Yahoo! with an optimized version of its Flash Player technology to help enhance search engine indexing of Flash content.

Later that year Google announced the launch of its Flash indexing algorithm; however, it advised that the new algorithm’s capabilities were limited. In November 2010 Google announced that its Flash indexing algorithm had been significantly improved and that it was capable of indexing almost any Flash content.

Currently, almost any text a user can see as they interact with a SWF file on your site can be indexed by Googlebot and used to generate a snippet or match query terms in Google searches. Googlebot can also discover URLs in SWF files and follow those links, so if your SWF content contains links to pages inside your website, Google may be able to crawl and index those pages as well.

If your project revolves around Flash technology and SEO is an important element of your project (which it should be anyway), then it is worth looking into the GAIA Framework. Depending on the level of interest, I might create a tutorial on the GAIA Framework and Flash SEO at some point next year.

As far as images are concerned, Google has used Optical Character Recognition (OCR) technology for over two years now. OCR allows Google to convert a picture of “a thousand words, into a thousand words – words that can be searched and indexed”.