org.apache.commons.codec.language
Class Soundex

java.lang.Object
  extended by org.apache.commons.codec.language.Soundex
All Implemented Interfaces:
Encoder, StringEncoder

public class Soundex
extends Object
implements StringEncoder

Encodes a string into a Soundex value. Soundex is an encoding used to relate similar names, but it can also be used as a general-purpose scheme to find words with similar phonemes.

Version:
$Id: Soundex.java,v 1.13 2003/11/12 19:02:57 ggregory Exp $
Author:
bayard@generationjava.com, Tim O'Brien, Gary Gregory

Field Summary
private  int maxLength
          Deprecated. This feature is not needed since the encoding size must be constant.
private  char[] soundexMapping
          Every letter of the alphabet is "mapped" to a numerical value.
static Soundex US_ENGLISH
          This static variable contains an instance of the Soundex using the US_ENGLISH mapping.
static char[] US_ENGLISH_MAPPING
          This is a default mapping of the 26 letters used in US English.
 
Constructor Summary
Soundex()
          Creates an instance of the Soundex object using the default US_ENGLISH mapping.
Soundex(char[] mapping)
          Creates a soundex instance using a custom mapping.
 
Method Summary
private  String clean(String str)
          Cleans up the input string before Soundex processing by only returning upper case letters.
 Object encode(Object pObject)
          Encodes an Object using the soundex algorithm.
 String encode(String pString)
          Encodes a String using the soundex algorithm.
private  char getMappingCode(String str, int index)
          Used internally by the Soundex algorithm.
 int getMaxLength()
          Deprecated. This feature is not needed since the encoding size must be constant.
private  char[] getSoundexMapping()
           
private  char map(char c)
          Maps the given upper-case character to its Soundex code.
 void setMaxLength(int maxLength)
          Deprecated. This feature is not needed since the encoding size must be constant.
private  void setSoundexMapping(char[] soundexMapping)
           
 String soundex(String str)
          Retrieves the Soundex code for a given String object.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

US_ENGLISH

public static final Soundex US_ENGLISH
This static variable contains an instance of the Soundex using the US_ENGLISH mapping.


US_ENGLISH_MAPPING

public static final char[] US_ENGLISH_MAPPING
This is a default mapping of the 26 letters used in US English. A value of 0 for a letter position means do not encode.
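
The idea of a per-letter mapping array can be sketched as follows. This is an illustration, not the library's source: the class name LetterMappingSketch and the method codeFor are hypothetical, and the 26-character table is assumed to be the standard American Soundex grouping (B, F, P, V = 1; C, G, J, K, Q, S, X, Z = 2; D, T = 3; L = 4; M, N = 5; R = 6; everything else = 0).

```java
/** Hypothetical illustration of a 26-entry letter-to-code table. */
final class LetterMappingSketch {
    // Assumed standard American Soundex groups, one code character per letter A..Z.
    // The character '0' marks letters that are not encoded.
    static final char[] MAPPING = "01230120022455012623010202".toCharArray();

    /** Looks up the code for a letter by its position in the alphabet. */
    static char codeFor(char letter) {
        int index = Character.toUpperCase(letter) - 'A';
        if (index < 0 || index >= MAPPING.length) {
            throw new IllegalArgumentException("Not a mapped letter: " + letter);
        }
        return MAPPING[index];
    }
}
```

With this table, LetterMappingSketch.codeFor('B') yields '1' and any vowel yields '0'.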


maxLength

private int maxLength
Deprecated. This feature is not needed since the encoding size must be constant.

The maximum length of a Soundex code - Soundex codes are only four characters by definition.


soundexMapping

private char[] soundexMapping
Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each letter is mapped. This implementation contains a default map for US_ENGLISH.

Constructor Detail

Soundex

public Soundex()
Creates an instance of the Soundex object using the default US_ENGLISH mapping.


Soundex

public Soundex(char[] mapping)
Creates a soundex instance using a custom mapping. This constructor can be used to customize the mapping, and/or possibly provide an internationalized mapping for a non-Western character set.

Parameters:
mapping - Mapping array to use when finding the corresponding code for a given character
Method Detail

clean

private String clean(String str)
Cleans up the input string before Soundex processing by only returning upper case letters.
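
One plausible reading of this cleaning step is sketched below: non-letter characters are dropped and the remaining letters are upper-cased. The class name CleanSketch is hypothetical, and the null and empty-string handling is an assumption rather than documented behavior.

```java
import java.util.Locale;

/** Hypothetical illustration of input cleaning before Soundex processing. */
final class CleanSketch {
    /** Drops every non-letter character and upper-cases what remains. */
    static String clean(String str) {
        if (str == null || str.isEmpty()) {
            return str; // assumption: nothing to clean, return the input unchanged
        }
        StringBuilder letters = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isLetter(c)) {
                letters.append(c);
            }
        }
        // A fixed locale keeps the result independent of the platform default.
        return letters.toString().toUpperCase(Locale.ENGLISH);
    }
}
```

Under these assumptions, CleanSketch.clean("O'Brien") returns "OBRIEN".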


encode

public Object encode(Object pObject)
              throws EncoderException
Encodes an Object using the soundex algorithm. This method is provided in order to satisfy the requirements of the Encoder interface, and will throw an EncoderException if the supplied object is not of type java.lang.String.

Specified by:
encode in interface Encoder
Parameters:
pObject - Object to encode
Returns:
An object (of type java.lang.String) containing the soundex code which corresponds to the String supplied.
Throws:
EncoderException - if the parameter supplied is not of type java.lang.String
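
The type check described above can be sketched without the library on the classpath. SketchEncoderException is a hypothetical stand-in for EncoderException, and encodeString is a placeholder for the real String encoding; only the reject-anything-but-String contract is taken from the description.

```java
import java.util.Locale;

/** Hypothetical stand-in for the library's EncoderException. */
class SketchEncoderException extends Exception {
    SketchEncoderException(String message) {
        super(message);
    }
}

/** Hypothetical illustration of an Object-accepting encode method. */
final class ObjectEncodeSketch {
    /** Placeholder for the real String encoding; it only upper-cases its input. */
    static String encodeString(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    /** Accepts any Object but encodes only Strings; null is rejected as well. */
    static Object encode(Object pObject) throws SketchEncoderException {
        if (!(pObject instanceof String)) {
            throw new SketchEncoderException(
                "Parameter supplied to encode is not of type java.lang.String");
        }
        return encodeString((String) pObject);
    }
}
```

Callers that already hold a String can avoid the checked exception by using the String overload directly.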

encode

public String encode(String pString)
Encodes a String using the soundex algorithm.

Specified by:
encode in interface StringEncoder
Parameters:
pString - A String object to encode
Returns:
A Soundex code corresponding to the String supplied

getMappingCode

private char getMappingCode(String str,
                            int index)
Used internally by the Soundex algorithm. Consonants from the same code group separated by W or H are treated as one.

Parameters:
str - the cleaned working string to encode (in upper case).
index - the character position to encode
Returns:
Mapping code for a particular character
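
The H/W rule is easiest to see inside a complete encoder. The following is a minimal sketch of American Soundex, not the library's implementation: the class name SoundexSketch is hypothetical, the default table is assumed to be the standard grouping, a custom table is assumed to have 26 entries (A to Z), and only ASCII letters are kept. H and W are skipped without resetting the previous code, so two consonants of the same group on either side of them collapse into one; a vowel does reset it.

```java
/** Hypothetical, simplified Soundex encoder illustrating the H/W rule. */
final class SoundexSketch {
    // Assumed standard American Soundex table for A..Z ('0' = not encoded).
    private static final char[] US_ENGLISH = "01230120022455012623010202".toCharArray();

    private final char[] mapping;

    SoundexSketch() {
        this(US_ENGLISH);
    }

    /** Assumes a 26-entry table, one code character per letter A..Z. */
    SoundexSketch(char[] mapping) {
        this.mapping = mapping.clone();
    }

    private char map(char upperCaseLetter) {
        return mapping[upperCaseLetter - 'A'];
    }

    String soundex(String input) {
        if (input == null) {
            return null;
        }
        // Keep ASCII letters only, upper-cased.
        StringBuilder letters = new StringBuilder();
        for (char c : input.toCharArray()) {
            char u = Character.toUpperCase(c);
            if (u >= 'A' && u <= 'Z') {
                letters.append(u);
            }
        }
        if (letters.length() == 0) {
            return "";
        }
        // The first letter is kept verbatim; the rest is padded with '0' to four characters.
        char[] out = {letters.charAt(0), '0', '0', '0'};
        char last = map(letters.charAt(0));
        int count = 1;
        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char c = letters.charAt(i);
            if (c == 'H' || c == 'W') {
                continue; // transparent: the previous code stays in effect
            }
            char code = map(c);
            if (code != '0' && code != last) {
                out[count++] = code;
            }
            last = code; // a vowel ('0') resets the previous code
        }
        return new String(out);
    }
}
```

With the assumed table, "Ashcraft" encodes to A261 because S and C share code 2 and are separated only by H, while "Robert" and "Rupert" both encode to R163.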

getMaxLength

public int getMaxLength()
Deprecated. This feature is not needed since the encoding size must be constant.

Returns the maxLength. Standard Soundex codes are four characters long.

Returns:
int

getSoundexMapping

private char[] getSoundexMapping()
Returns:
Returns the soundexMapping.

map

private char map(char c)
Maps the given upper-case character to its Soundex code.


setMaxLength

public void setMaxLength(int maxLength)
Deprecated. This feature is not needed since the encoding size must be constant.

Sets the maxLength.

Parameters:
maxLength - The maxLength to set

setSoundexMapping

private void setSoundexMapping(char[] soundexMapping)
Parameters:
soundexMapping - The soundexMapping to set.

soundex

public String soundex(String str)
Retrieves the Soundex code for a given String object.

Parameters:
str - String to encode using the Soundex algorithm
Returns:
A soundex code for the String supplied


Version 1.2 - Copyright © 2003 - Apache Software Foundation