    I'm being asked to estimate the time required to create a search engine that would check the content of PDFs. Does the code you used here work with PDFs? On a more "uber" level, are PDFs read by web browsers as having text, or as one monolithic object like an image?

    I wouldn't advise trying to write your own search engine, regardless of what file types you're targeting.

    PDF files are actually text based (I think), using a markup language based on PostScript. Parsing is not easy, but it's the indexing / caching / efficiency that's the real problem.

    However, there is a super easy solution - there's a PDF filter for Index Server (or Indexing Service on Win2K/XP), available as a free download from Adobe. Just install that and Index Server works with PDF files - a 5 minute solution to a multi-man-month problem...

    Dunc

