Does it work with PDFs

  1. #1
    Does it work with PDFs

    I'm being asked to estimate the time required to create a search engine that would check the content of PDFs. Does the code you used here work with PDFs? On a more "uber" level, are PDFs read by web browsers as having text, or as one monolithic object like an image?

  2. #2
    RE: Does it work with PDFs

    I wouldn't advise trying to write your own search engine, regardless of what file types you're targeting.

    PDF files are actually text based (I think), using a markup language based on PostScript. Parsing is not easy, but it's the indexing / caching / efficiency that's the real problem.

    However, there is a super easy solution - there's a PDF filter for Index Server (or Indexing Service on Win2K/XP), available as a free download from Adobe. Just install that and Index Server works with PDF files - a 5 minute solution to a multi-man-month problem...

    Dunc
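To give a feel for why parsing is the hard part, here is a rough Python sketch. It is not the code discussed in this thread and not a real PDF parser. It relies on two things about the format: a PDF starts with a `%PDF-` header, and page text is drawn by operators such as `Tj` inside content streams, which are usually compressed (commonly with Flate/zlib). The helper names `looks_like_pdf` and `literal_strings_shown` and the sample bytes are made up for illustration.

```python
import re
import zlib


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data begins with the PDF header comment."""
    return data.startswith(b"%PDF-")


def literal_strings_shown(content: bytes) -> list[str]:
    """Naively pull literal strings that are passed to the Tj operator.

    Deliberately simplistic: it ignores escapes, nested parentheses,
    hex strings, TJ arrays, and font encodings, all of which a real
    parser has to handle.
    """
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", content)
    return [m.decode("latin-1") for m in matches]


# A hand-written, uncompressed content stream fragment (illustrative only).
plain = b"BT /F1 12 Tf 72 720 Td (Hello PDF) Tj ET"

# Real files usually store this compressed, so a raw byte search over the
# file generally will not see the text until the stream is inflated.
packed = zlib.compress(plain)

print(looks_like_pdf(b"%PDF-1.4\n"))
print(literal_strings_shown(plain))
print(literal_strings_shown(zlib.decompress(packed)))
```

So the text is in there, but a search engine has to locate each stream, decompress it, and decode the strings before it can index anything, which is the work a ready-made filter saves you.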
