uk.co.weft.maybeupload
Class MultipartHandler

java.lang.Object
  |
  +--uk.co.weft.maybeupload.MultipartHandler

public class MultipartHandler
extends java.lang.Object

A handler for multipart/form-data content as defined by RFC 1867. One of the trickier elements of RFC 1867 is that multipart/mixed elements may be embedded inside multipart/form-data, and the RFC does not forbid arbitrary nesting of multipart elements. While I don't know whether any clients do nest multipart elements, it would be nice to be fully RFC 1867 compliant...

This class recognizes content-transfer-encodings but does not yet decode encoded parts. This is something I intend to fix in a later release.

Objects of this class are not intended to be reusable. Use once and throw away.

Version:
$Revision: 1.18 $ This revision: $Author: simon $
  $Log: MultipartHandler.java,v $
  Revision 1.18  2001/07/17 12:59:03  simon
  [stupid] I had missed java.io.PushbackInputStream, and reinvented the
  wheel. Fixed. Thanks, Randy, for pointing this out.

  Revision 1.17  2001/07/17 12:27:09  simon
  Rewrote handleInlinePart to use former readFilePart (now renamed to
  readPartData) in order to attempt to address Samuel ARNOD-PRIN's Mac
  upload bug; incorporated Randy Chang's latest bugfix to the core loop
  in readPartData.

  Revision 1.16  2001/07/13 12:56:36  simon
  Tidied up Thomas Wilson's disallowedCharacters stuff; merged Randy
  Chang's binary upload improvements; had another bash at the Mac upload
  problems.

  Revision 1.15  2001/06/26 09:39:07  thomas
  Protected HashTable attribute mapping Characters disallowed in a filename to characters which they should be replaced with. Defaults to ' ' -> '_'.

  Revision 1.14  2001/06/25 16:03:29  simon
  *** empty log message ***

  Revision 1.13  2001/06/25 15:55:14  simon
  Rewrote readFilePart() as a state transition engine; seems a lot
  cleaner, seems to work better. Still have doubts about efficiency.

  Revision 1.12  2001/06/21 08:52:00  simon
  Essentially just confirming Thomas' changes to MultipartHandler, and adding
  in an additional DEBUG message of my own...

  Revision 1.11 2001/06/13 10:04:30 thomas
  No longer use read < expected to control exit from the
  various loops which read data from the input stream. This method proved 
  to introduce problems and can be avoided by reading from the stream until 
  we reach the end and not when the count reaches the expected count. The 
  bytes expected is retained to provide a guide to the amount of data read 
  which may prove useful for debugging at some later date.

  Revision 1.10  2001/04/24 15:55:58  simon
  Patch release incorporating Aaron Dunlop's ByteArrayInputStream stuff.

  Revision 1.5  2001/04/11 20:06:56  aarond
  Added check for 0-length uploads (e.g., the user did not specify a file to upload)

  Revision 1.4  2001/04/11 00:29:02  aarond
  Upgraded maybeupload package to 1.0.2pre3 and patched with our changes

  Revision 1.9  2001/03/22 10:48:32  simon
  Bugfixes including a nasty one where if the last parameter in an input
  stream was inline, it got truncated by one byte.

  Revision 1.8  2001/02/22 11:04:07  simon
  Applied patch supplied by Juho Snellman to fix a problem in
  readFilePart. An alternative patch for the same problem was supplied
  by Cor Hofman. Grateful thanks to both.

  Revision 1.7  2001/01/23 19:12:17  simon
  A number of bugfixes, plus an important new feature: you can decide
  whether to allow name collisions in the upload directory to result
  in overwriting, renaming of the new file, or an exception.

  Revision 1.6  2001/01/22 15:56:39  simon
  'nother little horrible... tokens were being returned as '"token"',
  rather than 'token'. Fixed.

  Revision 1.5  2001/01/22 15:08:21  simon
  More bugs, unfortunately. Was double-counting some characters read;
  once that was sorted, found that I was reading off the end of input.
  Now appears fixed even for complex forms... touch wood.

  Revision 1.4  2001/01/09 12:45:47  simon
  Fixed the 'won't read past a null value' bug. Last known
  bug...

  Revision 1.3  2001/01/09 12:14:12  simon
  Now tested with:
  	Netscape Communicator 4.76/Linux 2.2
  	Konqueror 1.9.8/Linux 2.2
  	Microsoft Internet Explorer 5.00.2014.0216IC
  File upload (including binary file upload) works. Remaining known bug:
  all fields must have data...

  Revision 1.2  2001/01/08 12:39:09  simon
  Now working. Hooray! [that was *hard*]

  Revision 1.1.1.1  2001/01/05 14:58:09  simon
  First cut - not yet tested


  
Author:
Simon Brooke (simon@jasmine.org.uk)

Field Summary
protected  int anon
          a counter to use to name anonymous part values (should never be needed)
protected  char CTE_7BIT
          content-transfer-encoding types, as mandated by RFC 1521, section 5.
protected  char CTE_8BIT
           
protected  char CTE_BASE64
           
protected  char CTE_BINARY
           
protected  char CTE_QUOTED_PRINTABLE
           
protected  char CTE_XTOKEN
           
protected  boolean DEBUG
          whether to print debugging output.
protected  java.util.Hashtable disallowedCharacters
          Disallowed characters in filenames.
protected  java.util.Hashtable values
          the name-value pairs I have identified
 
Constructor Summary
(package private) MultipartHandler(java.util.Hashtable values, java.io.InputStream in, int length, java.lang.String cthdr, java.io.File workdir)
          read multiple values from this RFC 1867 formatted input stream into this hashtable
(package private) MultipartHandler(java.util.Hashtable values, java.io.InputStream in, int length, java.lang.String cthdr, java.io.File workdir, boolean saveUploadedFilesToDisk, boolean allowOverwrite, boolean silentlyRename)
          read multiple values from this RFC 1867 formatted input stream into this hashtable
 
Method Summary
 void disallow(char disallowed, char preferred)
          mark the specified character as disallowed in filenames, and replace it, if found, with the specified replacement
 void disallow(java.lang.String disallowed, char preferred)
          mark the specified characters as disallowed in filenames, and replace each, if found, with the specified replacement
protected  java.lang.String handleFilePart(java.util.Hashtable headers, char cte, java.lang.String boundary)
          read a value from the input stream up to the next boundary, save it to a file in my workdir whose name is the value of the 'filename' header in these headers, and cache a File object describing it in my values on the name which is the value of the "name" header in these headers
protected  java.lang.String handleInlinePart(java.util.Hashtable headers, char cte, java.lang.String boundary)
          read a value from the input stream up to the next boundary, and cache it in my values on the name which is the value of the "name" header in these headers
protected  java.lang.String handlePart(java.lang.String line, java.lang.String boundary)
          handle a single part of a multipart file, starting with this line which has already been read in
protected  void put(java.lang.String name, java.lang.Object value)
          within the name/value stream a name may have multiple values.
 int readLine(byte[] b, int off, int len, char cte)
          read a line up to and including a CR/LF line end from my InputStream into this buffer.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

values

protected java.util.Hashtable values
the name-value pairs I have identified

anon

protected int anon
a counter to use to name anonymous part values (should never be needed)

DEBUG

protected final boolean DEBUG
whether to print debugging output. Compile-time option only; do not set true for deliverable code!

CTE_7BIT

protected final char CTE_7BIT
content-transfer-encoding types, as mandated by RFC 1521, section 5. X-tokens will not be handled because, to quote the RFC, 'the creation of new Content-Transfer-Encoding values is explicitly and strongly discouraged'.

CTE_QUOTED_PRINTABLE

protected final char CTE_QUOTED_PRINTABLE

CTE_BASE64

protected final char CTE_BASE64

CTE_8BIT

protected final char CTE_8BIT

CTE_BINARY

protected final char CTE_BINARY

CTE_XTOKEN

protected final char CTE_XTOKEN

disallowedCharacters

protected java.util.Hashtable disallowedCharacters
Disallowed characters in filenames. A Hashtable mapping each disallowed Character to the Character which replaces it. Defaults to ' ' -> '_'.
Constructor Detail

MultipartHandler

MultipartHandler(java.util.Hashtable values,
                 java.io.InputStream in,
                 int length,
                 java.lang.String cthdr,
                 java.io.File workdir)
           throws java.io.IOException,
                  UploadException
read multiple values from this RFC 1867 formatted input stream into this hashtable
Parameters:
values - a hashtable to populate with the values read
in - an input stream, assumed to be RFC 1867 formatted
cthdr - the content-type header which identifies this stream as multipart
workdir - a directory in which to save uploaded files
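
The cthdr parameter matters because a multipart content-type header carries the boundary string that separates the parts. The following is a minimal sketch, not this class's actual code, of how a boundary could be pulled out of such a header. The class BoundarySketch and the method boundaryOf are hypothetical names introduced for illustration.

```java
class BoundarySketch {
    // Hypothetical helper: returns the boundary parameter of a multipart
    // content-type header with any surrounding double quotes removed,
    // or null if the header has no boundary parameter.
    static String boundaryOf(String cthdr) {
        for (String param : cthdr.split(";")) {
            String p = param.trim();
            // Parameter names are matched case-insensitively.
            if (p.regionMatches(true, 0, "boundary=", 0, 9)) {
                String value = p.substring(9).trim();
                if (value.length() >= 2
                        && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }
}
```

In the body itself, each part is introduced by a line consisting of two hyphens followed by this boundary. A nested multipart/mixed part declares its own boundary in its own content-type header, so the same extraction would apply at each level of nesting.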

MultipartHandler

MultipartHandler(java.util.Hashtable values,
                 java.io.InputStream in,
                 int length,
                 java.lang.String cthdr,
                 java.io.File workdir,
                 boolean saveUploadedFilesToDisk,
                 boolean allowOverwrite,
                 boolean silentlyRename)
           throws java.io.IOException,
                  UploadException
read multiple values from this RFC 1867 formatted input stream into this hashtable
Parameters:
values - a hashtable to populate with the values read
in - an input stream, assumed to be RFC 1867 formatted
cthdr - the content-type header which identifies this stream as multipart
workdir - a directory in which to save uploaded files
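
The allowOverwrite and silentlyRename parameters are not described here, but the log entry for revision 1.7 says that a name collision in the upload directory can result in overwriting, renaming of the new file, or an exception. The sketch below shows one way such a policy could be resolved. It is an assumption throughout: the class CollisionSketch, the method resolve, the precedence of the two flags, the "_1" renaming scheme and the use of IOException in place of UploadException are all hypothetical.

```java
import java.io.File;
import java.io.IOException;

class CollisionSketch {
    // Hypothetical policy: overwrite wins if allowed, otherwise rename if
    // permitted, otherwise fail. The real class may order these differently.
    static File resolve(File workdir, String filename,
                        boolean allowOverwrite, boolean silentlyRename)
            throws IOException {
        File target = new File(workdir, filename);
        if (!target.exists() || allowOverwrite) {
            return target;
        }
        if (!silentlyRename) {
            // Stand-in for UploadException, whose constructors are not shown here.
            throw new IOException("File already exists: " + target);
        }
        // Insert a counter before the extension until an unused name is found.
        int dot = filename.lastIndexOf('.');
        String stem = dot > 0 ? filename.substring(0, dot) : filename;
        String ext = dot > 0 ? filename.substring(dot) : "";
        for (int i = 1; ; i++) {
            File candidate = new File(workdir, stem + "_" + i + ext);
            if (!candidate.exists()) {
                return candidate;
            }
        }
    }
}
```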
Method Detail

disallow

public void disallow(char disallowed,
                     char preferred)
mark the specified character as disallowed in filenames, and replace it, if found, with the specified replacement
Parameters:
disallowed - the character we disallow
preferred - the character to replace it with

disallow

public void disallow(java.lang.String disallowed,
                     char preferred)
mark the specified characters as disallowed in filenames, and replace each, if found, with the specified replacement
Parameters:
disallowed - a String comprising the characters we disallow
preferred - the character to replace each of them with
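
As an illustration of how the two disallow methods and the disallowedCharacters table could work together, here is a minimal sketch. It assumes the table maps each disallowed Character to its replacement and starts with ' ' -> '_', as the log entry for revision 1.15 describes. The class DisallowSketch and the method sanitize are hypothetical and are not part of this class.

```java
import java.util.Hashtable;

class DisallowSketch {
    // Maps each disallowed Character to the Character that replaces it.
    private final Hashtable<Character, Character> disallowedCharacters =
            new Hashtable<>();

    DisallowSketch() {
        disallow(' ', '_'); // default taken from the revision 1.15 log entry
    }

    void disallow(char disallowed, char preferred) {
        disallowedCharacters.put(disallowed, preferred);
    }

    void disallow(String disallowed, char preferred) {
        // Every character of the String gets the same replacement.
        for (int i = 0; i < disallowed.length(); i++) {
            disallow(disallowed.charAt(i), preferred);
        }
    }

    // Hypothetical helper: applies the table to a candidate filename.
    String sanitize(String filename) {
        StringBuilder sb = new StringBuilder(filename.length());
        for (int i = 0; i < filename.length(); i++) {
            char c = filename.charAt(i);
            Character replacement = disallowedCharacters.get(c);
            sb.append(replacement != null ? replacement.charValue() : c);
        }
        return sb.toString();
    }
}
```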

handlePart

protected java.lang.String handlePart(java.lang.String line,
                                      java.lang.String boundary)
                               throws java.io.IOException,
                                      UploadException
handle a single part of a multipart file, starting with this line which has already been read in
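
The descriptions of handleFilePart and handleInlinePart suggest that a part is treated as a file when its headers carry a 'filename' value, and as an inline value otherwise. The sketch below shows how a Content-Disposition line could be broken into such a header table. It is an assumption about the approach, not the actual parsing code, and the names PartHeaderSketch, parseDisposition and isFilePart are hypothetical.

```java
import java.util.Hashtable;
import java.util.Locale;

class PartHeaderSketch {
    // Hypothetical helper: splits "Content-Disposition: form-data; name=..."
    // into key/value pairs, stripping the double quotes around each value.
    static Hashtable<String, String> parseDisposition(String line) {
        Hashtable<String, String> headers = new Hashtable<>();
        int colon = line.indexOf(':');
        for (String field : line.substring(colon + 1).split(";")) {
            int eq = field.indexOf('=');
            if (eq < 0) {
                continue; // e.g. the bare "form-data" disposition type
            }
            String key = field.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = field.substring(eq + 1).trim();
            if (value.length() >= 2
                    && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            headers.put(key, value);
        }
        return headers;
    }

    // A part with a filename would be routed to the file handler.
    static boolean isFilePart(Hashtable<String, String> headers) {
        return headers.containsKey("filename");
    }
}
```

This simple split on ';' would mishandle a filename that itself contains a semicolon, so a real parser would need to respect quoting.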

handleInlinePart

protected java.lang.String handleInlinePart(java.util.Hashtable headers,
                                            char cte,
                                            java.lang.String boundary)
                                     throws java.io.IOException
read a value from the input stream up to the next boundary, and cache it in my values on the name which is the value of the "name" header in these headers
Parameters:
headers - a hash of the headers of the current part
cte - the content-transfer-encoding of the current part
boundary - the boundary of the current part
Returns:
the last line read, which should be the boundary

handleFilePart

protected java.lang.String handleFilePart(java.util.Hashtable headers,
                                          char cte,
                                          java.lang.String boundary)
                                   throws java.io.IOException,
                                          UploadException
read a value from the input stream up to the next boundary, save it to a file in my workdir whose name is the value of the 'filename' header in these headers, and cache a File object describing it in my values on the name which is the value of the "name" header in these headers
Parameters:
headers - a hash of the headers of the current part
cte - the content-transfer-encoding of the current part
boundary - the boundary of the current part
Returns:
the last line read, which should be the boundary

put

protected void put(java.lang.String name,
                   java.lang.Object value)
within the name/value stream a name may have multiple values. If the name has just one value, I want to store just the object, because that makes life simpler; however, if I find a second value, I want to convert the entry to a Vector of values
Parameters:
name - the key to store against
value - the String or File to store
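
The single-value-or-Vector behaviour described above can be sketched as follows. This is an illustration of the described behaviour rather than the actual method body, and the class name PutSketch is hypothetical.

```java
import java.util.Hashtable;
import java.util.Vector;

class PutSketch {
    final Hashtable<String, Object> values = new Hashtable<>();

    @SuppressWarnings("unchecked")
    void put(String name, Object value) {
        Object existing = values.get(name);
        if (existing == null) {
            // First value for this name: store the object itself.
            values.put(name, value);
        } else if (existing instanceof Vector) {
            // Third or later value: append to the existing Vector.
            ((Vector<Object>) existing).add(value);
        } else {
            // Second value: promote the single object to a Vector of both.
            Vector<Object> all = new Vector<>();
            all.add(existing);
            all.add(value);
            values.put(name, all);
        }
    }
}
```

A caller reading from such a table would need to check whether the stored object is a Vector before treating it as a single String or File.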

readLine

public int readLine(byte[] b,
                    int off,
                    int len,
                    char cte)
             throws java.io.IOException
read a line up to and including a CR/LF line end from my InputStream into this buffer.
Parameters:
b - a byte array to read into
off - the offset in the buffer at which to start
len - the maximum number of bytes to read
cte - the content-transfer-encoding used in the current part. Currently not used. A later version of this method may decode as it reads
Returns:
the number of bytes read
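
A line read of this kind can be sketched as below. This is a simplified stand-in, not the actual method: it takes the InputStream as an argument instead of using the handler's own stream, omits the unused cte parameter, and returns -1 when the stream is already exhausted, which is an assumption this documentation does not confirm. The class name ReadLineSketch is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

class ReadLineSketch {
    // Copies bytes into b starting at off until a CR/LF pair has been stored,
    // len bytes have been read, or the stream ends. Returns the number of
    // bytes read, or -1 (assumed convention) if no bytes were available.
    static int readLine(InputStream in, byte[] b, int off, int len)
            throws IOException {
        int count = 0;
        int prev = -1;
        while (count < len) {
            int c = in.read();
            if (c == -1) {
                return count == 0 ? -1 : count;
            }
            b[off + count++] = (byte) c;
            if (prev == '\r' && c == '\n') {
                break; // the CR/LF pair is included in the returned line
            }
            prev = c;
        }
        return count;
    }
}
```

Because the line end is kept in the buffer, a caller comparing a line against a boundary would need to allow for the trailing CR/LF.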