Mime

Name

Mime -- defines an external parser for a given MIME type

indexer.conf

Synopsis

Mime {from_mime} {to_mime} {command line}

Description

This command adds support for parsing documents whose MIME type is something other than text/plain or text/html. Support can come from an external parser, which must produce plain-text or HTML output, or simply from substituting a MIME type that indexer already understands.

from_mime and to_mime are standard MIME types. to_mime is either text/plain or text/html.

An optional charset parameter may be given with to_mime, for example "text/plain; charset=cp1251", to change the charset if needed.

The command line may contain a $1 parameter, which stands for a temporary file name. Some parsers cannot operate on stdin, so indexer creates a temporary file for the parser and passes its name in place of $1. See the documentation for other parser types and for an explanation of parser usage.
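
The two ways of invoking a parser can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not indexer's actual implementation: run_parser is a hypothetical helper, and it assumes that a command line without $1 receives the document on stdin and that the parser's output is read from stdout.

```python
import os
import shlex
import subprocess
import tempfile


def run_parser(command_line: str, document: bytes) -> bytes:
    """Run an external parser command and return what it writes to stdout.

    Hypothetical helper: it only illustrates the two invocation modes and
    is not taken from indexer's source code.
    """
    if "$1" in command_line:
        # The parser needs a file: write the document to a temporary file
        # and put its shell-quoted name in place of $1.
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(document)
            path = tmp.name
        try:
            cmd = command_line.replace("$1", shlex.quote(path))
            result = subprocess.run(
                cmd, shell=True, stdout=subprocess.PIPE, check=True
            )
        finally:
            # Remove the temporary file whether or not the parser succeeded.
            os.unlink(path)
    else:
        # Assumed stdin mode: pipe the document straight into the parser.
        result = subprocess.run(
            command_line,
            shell=True,
            input=document,
            stdout=subprocess.PIPE,
            check=True,
        )
    return result.stdout
```

With this sketch, run_parser("pdftotext $1 -", pdf_bytes) would follow the temporary-file path and run_parser("deroff", man_bytes) the stdin path, assuming those tools are installed and pdf_bytes and man_bytes hold the raw document contents.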

Examples


Mime application/msword      "text/plain; charset=cp1251"  "catdoc $1"
Mime application/x-troff-man  text/plain                    "deroff"
Mime text/x-postscript        text/plain                    "ps2ascii"
Mime application/pdf          text/plain                    "pdftotext $1 -"
Mime application/vnd.ms-excel text/plain                    "xls2csv $1"
Mime "text/rtf*"              text/html                     "rthc --use-stdout $1 2>/dev/null"