Class URI

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, java.lang.Comparable
    Direct Known Subclasses:
    HttpURL, URIUtil.Coder

    public class URI
    extends java.lang.Object
    implements java.lang.Cloneable, java.lang.Comparable, java.io.Serializable
    The interface for the URI (Uniform Resource Identifier) version of RFC 2396. This class supports parsing a URI reference so that it can be extended for specific protocols, taking into account the character encoding of the protocol to be transported and the charset of the document.

    A URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics.

    Implementers should be careful not to escape or unescape the same string more than once, since unescaping an already unescaped string might lead to misinterpreting a percent data character as another escaped character, or vice versa in the case of escaping an already escaped string.
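
The hazard described above can be reproduced with a few lines of plain Java. This is a minimal sketch, not this class's decoder: the helper `unescapeOnce` and the sample string are hypothetical, and the demo assumes ASCII-only input so that each decoded octet maps directly to one character.

```java
class DoubleUnescapeDemo {
    /**
     * Decodes every well-formed "%" hex hex triplet exactly once.
     * Assumption for this sketch: the input is ASCII, so each decoded
     * octet can be appended directly as a char.
     */
    static String unescapeOnce(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.append((char) (hi * 16 + lo));
                    i += 3;
                    continue;
                }
            }
            out.append(c); // not part of an escape: copy unchanged
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = "100%2541";           // "%25" stands for a literal percent sign
        String once = unescapeOnce(escaped);   // the intended data
        String twice = unescapeOnce(once);     // the percent data character is misread as an escape
        System.out.println(once + " vs " + twice);
    }
}
```

Unescaping a second time silently turns the data `%41` into `A`, which is why the escaped and unescaped forms are kept in different data types.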

    In order to avoid these problems, data types are used as follows:

       URI character sequence: char
       octet sequence: byte
       original character sequence: String
     

    So, a URI is a sequence of characters held as an array of char, which is not always represented as a sequence of octets held as an array of byte.
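
The three layers can be illustrated with the JDK alone. The sketch below is an assumption-laden illustration, not this class's code: the hypothetical `escapeAll` turns an original String into octets using a chosen charset, then into URI characters, escaping everything except ASCII letters and digits.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EscapeLayersDemo {
    /** original String -> octets (byte[]) -> URI characters (char[]). */
    static char[] escapeAll(String original, Charset charset) {
        byte[] octets = original.getBytes(charset);       // the octet sequence
        StringBuilder uriChars = new StringBuilder();     // the URI character sequence
        for (byte b : octets) {
            int v = b & 0xFF;
            boolean plain = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9');
            if (plain) {
                uriChars.append((char) v);
            } else {
                uriChars.append('%')
                        .append(Character.toUpperCase(Character.forDigit(v >> 4, 16)))
                        .append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return uriChars.toString().toCharArray();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // sample text chosen for this sketch
        System.out.println(new String(escapeAll(original, StandardCharsets.UTF_8)));
        System.out.println(new String(escapeAll(original, StandardCharsets.ISO_8859_1)));
    }
}
```

The same original string yields different URI characters under different charsets, which is why the charset has to travel with the conversion.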

    URI Syntactic Components

     - In general, written as follows:
       Absolute URI = <scheme>:<scheme-specific-part>
       Generic URI = <scheme>://<authority><path>?<query>
    
     - Syntax
       absoluteURI   = scheme ":" ( hier_part | opaque_part )
       hier_part     = ( net_path | abs_path ) [ "?" query ]
       net_path      = "//" authority [ abs_path ]
       abs_path      = "/"  path_segments
     

    The following examples illustrate URIs that are in common use.

     ftp://ftp.is.co.za/rfc/rfc1808.txt
        -- ftp scheme for File Transfer Protocol services
     gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
        -- gopher scheme for Gopher and Gopher+ Protocol services
     http://www.math.uio.no/faq/compression-faq/part1.html
        -- http scheme for Hypertext Transfer Protocol services
     mailto:mduerst@ifi.unizh.ch
        -- mailto scheme for electronic mail addresses
     news:comp.infosystems.www.servers.unix
        -- news scheme for USENET news groups and articles
     telnet://melvyl.ucop.edu/
        -- telnet scheme for interactive services via the TELNET Protocol
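
To see how the examples above split into hierarchical and opaque forms, the sketch below uses the JDK's `java.net.URI`, which also follows the RFC 2396 grammar. It is a separate class from the one documented here and is used only because it ships with the JDK; the `describe` helper is hypothetical.

```java
import java.net.URI;

class ExampleUriComponents {
    /** Summarizes a URI as opaque (scheme:part) or hierarchical (scheme://authority/path). */
    static String describe(String text) {
        URI u = URI.create(text); // throws IllegalArgumentException on a syntax error
        if (u.isOpaque()) {
            return "opaque scheme=" + u.getScheme() + " part=" + u.getSchemeSpecificPart();
        }
        return "hierarchical scheme=" + u.getScheme()
                + " authority=" + u.getAuthority()
                + " path=" + u.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("ftp://ftp.is.co.za/rfc/rfc1808.txt"));
        System.out.println(describe("mailto:mduerst@ifi.unizh.ch"));
        System.out.println(describe("news:comp.infosystems.www.servers.unix"));
        System.out.println(describe("telnet://melvyl.ucop.edu/"));
    }
}
```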
     
    Note that there are many modifications from URL (RFC 1738) and relative URL (RFC 1808).

    The expressions for a URI

     For escaped URI forms
      - URI(char[]) // constructor
      - char[] getRawXxx() // method
      - String getEscapedXxx() // method
      - String toString() // method
     

     For unescaped URI forms
      - URI(String) // constructor
      - String getXXX() // method
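
The split between escaped and unescaped accessors has a close analogue in the JDK. The sketch below uses `java.net.URI` (not the class documented here) purely as a readily available illustration: `getRawPath()` returns the escaped form and `getPath()` the unescaped form of the same component.

```java
import java.net.URI;

class EscapedVersusUnescaped {
    /** Returns the escaped path exactly as written in the URI. */
    static String escapedPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    /** Returns the path with percent escapes decoded. */
    static String unescapedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String gopher = "gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles";
        System.out.println(escapedPath(gopher));
        System.out.println(unescapedPath(gopher));
    }
}
```

Only the escaped form is safe to put back into a URI; the unescaped form is meant for display or for use as data.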

    Version:
    $Revision: 564973 $ $Date: 2002/03/14 15:14:01
    Author:
    Sung-Gu, Mike Bowler
    See Also:
    Serialized Form
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  URI.DefaultCharsetChanged
      Represents the normal-operation condition that the default charset has changed, of which the user must be alerted.
      static class  URI.LocaleToCharsetMap
      A mapping to determine the (somewhat arbitrarily) preferred charset for a given locale.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected char[] _authority
      The authority.
      protected char[] _fragment
      The fragment.
      protected char[] _host
      The host.
      protected boolean _is_abs_path  
      protected boolean _is_hier_part  
      protected boolean _is_hostname  
      protected boolean _is_IPv4address  
      protected boolean _is_IPv6reference  
      protected boolean _is_net_path  
      protected boolean _is_opaque_part  
      protected boolean _is_reg_name  
      protected boolean _is_rel_path  
      protected boolean _is_server  
      protected char[] _opaque
      The opaque.
      protected char[] _path
      The path.
      protected int _port
      The port.
      protected char[] _query
      The query.
      protected char[] _scheme
      The scheme.
      protected char[] _uri
      This Uniform Resource Identifier (URI).
      protected char[] _userinfo
      The userinfo.
      protected static java.util.BitSet abs_path
      URI absolute path.
      protected static java.util.BitSet absoluteURI
      BitSet for absoluteURI.
      static java.util.BitSet allowed_abs_path
      Those characters that are allowed for the abs_path.
      static java.util.BitSet allowed_authority
      Those characters that are allowed for the authority component.
      static java.util.BitSet allowed_fragment
      Those characters that are allowed for the fragment component.
      static java.util.BitSet allowed_host
      Those characters that are allowed for the host component.
      static java.util.BitSet allowed_IPv6reference
      Those characters that are allowed for the IPv6reference component.
      static java.util.BitSet allowed_opaque_part
      Those characters that are allowed for the opaque_part.
      static java.util.BitSet allowed_query
      Those characters that are allowed for the query component.
      static java.util.BitSet allowed_reg_name
      Those characters that are allowed for the reg_name.
      static java.util.BitSet allowed_rel_path
      Those characters that are allowed for the rel_path.
      static java.util.BitSet allowed_userinfo
      Those characters that are allowed for the userinfo component.
      static java.util.BitSet allowed_within_authority
      Those characters that are allowed for the authority component.
      static java.util.BitSet allowed_within_path
      Those characters that are allowed within the path.
      static java.util.BitSet allowed_within_query
      Those characters that are allowed within the query component.
      static java.util.BitSet allowed_within_userinfo
      Those characters that are allowed within the userinfo component.
      protected static java.util.BitSet alpha
      BitSet for alpha.
      protected static java.util.BitSet alphanum
      BitSet for alphanum (join of alpha & digit).
      protected static java.util.BitSet authority
      BitSet for authority.
      static java.util.BitSet control
      BitSet for control.
      protected static java.lang.String defaultDocumentCharset
      The default charset of the document.
      protected static java.lang.String defaultDocumentCharsetByLocale  
      protected static java.lang.String defaultDocumentCharsetByPlatform  
      protected static java.lang.String defaultProtocolCharset
      The default charset of the protocol.
      static java.util.BitSet delims
      BitSet for delims.
      protected static java.util.BitSet digit
      BitSet for digit.
      static java.util.BitSet disallowed_opaque_part
      Disallowed opaque_part before escaping.
      static java.util.BitSet disallowed_rel_path
      Disallowed rel_path before escaping.
      protected static java.util.BitSet domainlabel
      BitSet for domainlabel.
      protected static java.util.BitSet escaped
      BitSet for escaped.
      protected static java.util.BitSet fragment
      BitSet for fragment (alias for uric).
      protected int hash
      Cache the hash code for this URI.
      protected static java.util.BitSet hex
      BitSet for hex.
      protected static java.util.BitSet hier_part
      BitSet for hier_part.
      protected static java.util.BitSet host
      BitSet for host.
      protected static java.util.BitSet hostname
      BitSet for hostname.
      protected static java.util.BitSet hostport
      BitSet for hostport.
      protected static java.util.BitSet IPv4address
      BitSet that combines digit and dot for IPv4address.
      protected static java.util.BitSet IPv6address
      RFC 2373.
      protected static java.util.BitSet IPv6reference
      RFC 2732, 2373.
      protected static java.util.BitSet mark
      BitSet for mark.
      protected static java.util.BitSet net_path
      BitSet for net_path.
      protected static java.util.BitSet opaque_part
      URI bitset that combines uric_no_slash and uric.
      protected static java.util.BitSet param
      BitSet for param (alias for pchar).
      protected static java.util.BitSet path
      URI bitset that combines absolute path and opaque part.
      protected static java.util.BitSet path_segments
      BitSet for path segments.
      protected static java.util.BitSet pchar
      BitSet for pchar.
      protected static java.util.BitSet percent
      The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.
      protected static java.util.BitSet port
      Port, a logical alias for digit.
      protected java.lang.String protocolCharset
      The charset of the protocol used by this URI instance.
      protected static java.util.BitSet query
      BitSet for query (alias for uric).
      protected static java.util.BitSet reg_name
      BitSet for reg_name.
      protected static java.util.BitSet rel_path
      BitSet for rel_path.
      protected static java.util.BitSet rel_segment
      BitSet for rel_segment.
      protected static java.util.BitSet relativeURI
      BitSet for relativeURI.
      protected static java.util.BitSet reserved
      BitSet for reserved.
      protected static char[] rootPath
      The root path.
      protected static java.util.BitSet scheme
      BitSet for scheme.
      protected static java.util.BitSet segment
      BitSet for segment.
      protected static java.util.BitSet server
      Bitset for server.
      static java.util.BitSet space
      BitSet for space.
      protected static java.util.BitSet toplabel
      BitSet for toplabel.
      protected static java.util.BitSet unreserved
      Data characters that are allowed in a URI but do not have a reserved purpose are called unreserved.
      static java.util.BitSet unwise
      BitSet for unwise.
      protected static java.util.BitSet URI_reference
      BitSet for URI-reference.
      protected static java.util.BitSet uric
      BitSet for uric.
      protected static java.util.BitSet uric_no_slash
      URI bitset for encoding typical non-slash characters.
      protected static java.util.BitSet userinfo
      Bitset for userinfo.
      static java.util.BitSet within_userinfo
      BitSet for within the userinfo component like user and password.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected URI()
      Create an instance for internal use.
        URI​(char[] escaped)
      Deprecated.
      Use #URI(String, boolean)
        URI​(char[] escaped, java.lang.String charset)
      Deprecated.
      Use #URI(String, boolean, String)
        URI​(java.lang.String original)
      Deprecated.
      Use #URI(String, boolean)
        URI​(java.lang.String s, boolean escaped)
      Construct a URI from a string with the given charset.
        URI​(java.lang.String s, boolean escaped, java.lang.String charset)
      Construct a URI from a string with the given charset.
        URI​(java.lang.String original, java.lang.String charset)
      Deprecated.
      Use #URI(String, boolean, String)
        URI​(java.lang.String scheme, java.lang.String schemeSpecificPart, java.lang.String fragment)
      Construct a general URI from the given components.
        URI​(java.lang.String scheme, java.lang.String userinfo, java.lang.String host, int port)
      Construct a general URI from the given components.
        URI​(java.lang.String scheme, java.lang.String userinfo, java.lang.String host, int port, java.lang.String path)
      Construct a general URI from the given components.
        URI​(java.lang.String scheme, java.lang.String userinfo, java.lang.String host, int port, java.lang.String path, java.lang.String query)
      Construct a general URI from the given components.
        URI​(java.lang.String scheme, java.lang.String userinfo, java.lang.String host, int port, java.lang.String path, java.lang.String query, java.lang.String fragment)
      Construct a general URI from the given components.
        URI​(java.lang.String scheme, java.lang.String host, java.lang.String path, java.lang.String fragment)
      Construct a general URI from the given components.
        URI​(java.lang.String scheme, java.lang.String authority, java.lang.String path, java.lang.String query, java.lang.String fragment)
      Construct a general URI from the given components.
        URI​(URI base, java.lang.String relative)
      Deprecated.
      Use #URI(URI, String, boolean)
        URI​(URI base, java.lang.String relative, boolean escaped)
      Construct a general URI with the given relative URI string.
        URI​(URI base, URI relative)
      Construct a general URI with the given relative URI.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.lang.Object clone()
      Create and return a copy of this object, the URI-reference containing the userinfo component.
      int compareTo​(java.lang.Object obj)
      Compare this URI to another object.
      protected static java.lang.String decode​(char[] component, java.lang.String charset)
      Decodes URI encoded string.
      protected static java.lang.String decode​(java.lang.String component, java.lang.String charset)
      Decodes URI encoded string.
      protected static char[] encode​(java.lang.String original, java.util.BitSet allowed, java.lang.String charset)
      Encodes URI string.
      protected boolean equals​(char[] first, char[] second)
      Test if the first array is equal to the second array.
      boolean equals​(java.lang.Object obj)
      Test whether this URI is equal to another object.
      java.lang.String getAboveHierPath()
      Get the level above this hierarchy level.
      java.lang.String getAuthority()
      Get the authority.
      java.lang.String getCurrentHierPath()
      Get the current hierarchy level.
      static java.lang.String getDefaultDocumentCharset()
      Get the recommended default charset of the document.
      static java.lang.String getDefaultDocumentCharsetByLocale()
      Get the default charset of the document by locale.
      static java.lang.String getDefaultDocumentCharsetByPlatform()
      Get the default charset of the document by platform.
      static java.lang.String getDefaultProtocolCharset()
      Get the default charset of the protocol.
      java.lang.String getEscapedAboveHierPath()
      Get the level above this hierarchy level.
      java.lang.String getEscapedAuthority()
      Get the escaped authority.
      java.lang.String getEscapedCurrentHierPath()
      Get the escaped current hierarchy level.
      java.lang.String getEscapedFragment()
      Get the escaped fragment.
      java.lang.String getEscapedName()
      Get the escaped basename of the path.
      java.lang.String getEscapedPath()
      Get the escaped path.
      java.lang.String getEscapedPathQuery()
      Get the escaped path and query.
      java.lang.String getEscapedQuery()
      Get the escaped query.
      java.lang.String getEscapedURI()
      Get the URI character sequence.
      java.lang.String getEscapedURIReference()
      Get the escaped URI reference string.
      java.lang.String getEscapedUserinfo()
      Get the escaped userinfo.
      java.lang.String getFragment()
      Get the fragment.
      java.lang.String getHost()
      Get the host.
      java.lang.String getName()
      Get the basename of the path.
      java.lang.String getPath()
      Get the path.
      java.lang.String getPathQuery()
      Get the path and query.
      int getPort()
      Get the port.
      java.lang.String getProtocolCharset()
      Get the protocol charset used by this current URI instance.
      java.lang.String getQuery()
      Get the query.
      char[] getRawAboveHierPath()
      Get the level above this hierarchy level.
      char[] getRawAuthority()
      Get the raw-escaped authority.
      char[] getRawCurrentHierPath()
      Get the raw-escaped current hierarchy level.
      protected char[] getRawCurrentHierPath​(char[] path)
      Get the raw-escaped current hierarchy level in the given path.
      char[] getRawFragment()
      Get the raw-escaped fragment.
      char[] getRawHost()
      Get the host.
      char[] getRawName()
      Get the raw-escaped basename of the path.
      char[] getRawPath()
      Get the raw-escaped path.
      char[] getRawPathQuery()
      Get the raw-escaped path and query.
      char[] getRawQuery()
      Get the raw-escaped query.
      char[] getRawScheme()
      Get the scheme.
      char[] getRawURI()
      Get the URI character sequence.
      char[] getRawURIReference()
      Get the URI reference character sequence.
      char[] getRawUserinfo()
      Get the raw-escaped userinfo.
      java.lang.String getScheme()
      Get the scheme.
      java.lang.String getURI()
      Get the URI character sequence.
      java.lang.String getURIReference()
      Get the original URI reference string.
      java.lang.String getUserinfo()
      Get the userinfo.
      boolean hasAuthority()
      Tell whether or not this URI has authority.
      boolean hasFragment()
      Tell whether or not this URI has fragment.
      int hashCode()
      Return a hash code for this URI.
      boolean hasQuery()
      Tell whether or not this URI has query.
      boolean hasUserinfo()
      Tell whether or not this URI has userinfo.
      protected int indexFirstOf​(char[] s, char delim)
      Get the index of the first occurrence of the given delimiter in the given array.
      protected int indexFirstOf​(char[] s, char delim, int offset)
      Get the index of the first occurrence of the given delimiter in the given array, starting at the given offset.
      protected int indexFirstOf​(java.lang.String s, java.lang.String delims)
      Get the earliest index at which any of the given delimiters first occurs in the given string.
      protected int indexFirstOf​(java.lang.String s, java.lang.String delims, int offset)
      Get the earliest index at which any of the given delimiters first occurs in the given string, starting at the given offset.
      boolean isAbsoluteURI()
      Tell whether or not this URI is absolute.
      boolean isAbsPath()
      Tell whether or not the relativeURI or hier_part of this URI is abs_path.
      boolean isHierPart()
      Tell whether or not the absoluteURI of this URI is hier_part.
      boolean isHostname()
      Tell whether or not the host part of this URI is hostname.
      boolean isIPv4address()
      Tell whether or not the host part of this URI is IPv4address.
      boolean isIPv6reference()
      Tell whether or not the host part of this URI is IPv6reference.
      boolean isNetPath()
      Tell whether or not the relativeURI or hier_part of this URI is net_path.
      boolean isOpaquePart()
      Tell whether or not the absoluteURI of this URI is opaque_part.
      boolean isRegName()
      Tell whether or not the authority component of this URI is reg_name.
      boolean isRelativeURI()
      Tell whether or not this URI is relative.
      boolean isRelPath()
      Tell whether or not the relativeURI of this URI is rel_path.
      boolean isServer()
      Tell whether or not the authority component of this URI is server.
      void normalize()
      Normalizes the path part of this URI.
      protected char[] normalize​(char[] path)
      Normalize the given hier path part.
      protected void parseAuthority​(java.lang.String original, boolean escaped)
      Parse the authority component.
      protected void parseUriReference​(java.lang.String original, boolean escaped)
      Parse a URI reference as a String with the character encoding of the local system or the document, in order to avoid any possibility of conflict with non-ASCII characters.
      protected boolean prevalidate​(java.lang.String component, java.util.BitSet disallowed)
      Pre-validate the unescaped URI string within a specific component.
      protected char[] removeFragmentIdentifier​(char[] component)
      Remove the fragment identifier of the given component.
      protected char[] resolvePath​(char[] basePath, char[] relPath)
      Resolve the base and relative path.
      static void setDefaultDocumentCharset​(java.lang.String charset)
      Set the default charset of the document.
      static void setDefaultProtocolCharset​(java.lang.String charset)
      Set the default charset of the protocol.
      void setEscapedAuthority​(java.lang.String escapedAuthority)
      Set the authority.
      void setEscapedFragment​(java.lang.String escapedFragment)
      Set the escaped fragment string.
      void setEscapedPath​(java.lang.String escapedPath)
      Set the escaped path.
      void setEscapedQuery​(java.lang.String escapedQuery)
      Set the escaped query string.
      void setFragment​(java.lang.String fragment)
      Set the fragment.
      void setPath​(java.lang.String path)
      Set the path.
      void setQuery​(java.lang.String query)
      Set the query.
      void setRawAuthority​(char[] escapedAuthority)
      Set the authority.
      void setRawFragment​(char[] escapedFragment)
      Set the raw-escaped fragment.
      void setRawPath​(char[] escapedPath)
      Set the raw-escaped path.
      void setRawQuery​(char[] escapedQuery)
      Set the raw-escaped query.
      protected void setURI()
      Once it's parsed successfully, set this URI.
      java.lang.String toString()
      Get the escaped URI string.
      protected boolean validate​(char[] component, int soffset, int eoffset, java.util.BitSet generous)
      Validate the URI characters within a specific component.
      protected boolean validate​(char[] component, java.util.BitSet generous)
      Validate the URI characters within a specific component.
      • Methods inherited from class java.lang.Object

        finalize, getClass, notify, notifyAll, wait, wait, wait
    • Field Detail

      • hash

        protected int hash
        Cache the hash code for this URI.
      • _uri

        protected char[] _uri
        This Uniform Resource Identifier (URI). The URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics.
      • protocolCharset

        protected java.lang.String protocolCharset
        The charset of the protocol used by this URI instance.
      • defaultProtocolCharset

        protected static java.lang.String defaultProtocolCharset
        The default charset of the protocol. RFC 2277, 2396
      • defaultDocumentCharset

        protected static java.lang.String defaultDocumentCharset
        The default charset of the document (RFC 2277, 2396). The platform's charset is used for the document by default.
      • defaultDocumentCharsetByLocale

        protected static java.lang.String defaultDocumentCharsetByLocale
      • defaultDocumentCharsetByPlatform

        protected static java.lang.String defaultDocumentCharsetByPlatform
      • _scheme

        protected char[] _scheme
        The scheme.
      • _opaque

        protected char[] _opaque
        The opaque.
      • _authority

        protected char[] _authority
        The authority.
      • _userinfo

        protected char[] _userinfo
        The userinfo.
      • _host

        protected char[] _host
        The host.
      • _port

        protected int _port
        The port.
      • _path

        protected char[] _path
        The path.
      • _query

        protected char[] _query
        The query.
      • _fragment

        protected char[] _fragment
        The fragment.
      • rootPath

        protected static final char[] rootPath
        The root path.
      • percent

        protected static final java.util.BitSet percent
        The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.
      • digit

        protected static final java.util.BitSet digit
        BitSet for digit.

         digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                    "8" | "9"
         

      • alpha

        protected static final java.util.BitSet alpha
        BitSet for alpha.

         alpha         = lowalpha | upalpha
         

      • alphanum

        protected static final java.util.BitSet alphanum
        BitSet for alphanum (join of alpha & digit).

          alphanum      = alpha | digit
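
The digit, alpha and alphanum rules above map naturally onto `java.util.BitSet`, with one bit per character code. The following is a minimal sketch of how such sets could be built and combined; the helper names `range` and `union` are hypothetical and the actual initialization in this class may differ.

```java
import java.util.BitSet;

class CharClassSets {
    /** Returns a set with the bits for every character from..to (inclusive) turned on. */
    static BitSet range(char from, char to) {
        BitSet b = new BitSet(256);
        b.set(from, to + 1); // BitSet.set(from, toExclusive)
        return b;
    }

    /** Returns the union of the given sets, mirroring the "|" in the grammar. */
    static BitSet union(BitSet... parts) {
        BitSet result = new BitSet(256);
        for (BitSet p : parts) {
            result.or(p);
        }
        return result;
    }

    static final BitSet DIGIT = range('0', '9');
    static final BitSet ALPHA = union(range('a', 'z'), range('A', 'Z'));
    static final BitSet ALPHANUM = union(ALPHA, DIGIT);

    public static void main(String[] args) {
        System.out.println(ALPHANUM.get('7') + " " + ALPHANUM.get('-'));
    }
}
```

Membership tests are then a single `get(ch)` call, which is cheap enough to run on every character of a component.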
         

      • hex

        protected static final java.util.BitSet hex
        BitSet for hex.

         hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                                 "a" | "b" | "c" | "d" | "e" | "f"
         

      • escaped

        protected static final java.util.BitSet escaped
        BitSet for escaped.

         escaped       = "%" hex hex
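
A BitSet can only say which single characters are permitted, whereas the escaped rule is a three-character sequence. A hedged sketch of the extra sequence check follows; `escapesWellFormed` is a hypothetical helper, not a method of this class.

```java
import java.util.BitSet;

class EscapeTripletCheck {
    static final BitSet HEX = new BitSet(256);
    static {
        for (char c : "0123456789ABCDEFabcdef".toCharArray()) {
            HEX.set(c);
        }
    }

    /** Returns true when every '%' in the component is followed by two hex digits. */
    static boolean escapesWellFormed(char[] component) {
        for (int i = 0; i < component.length; i++) {
            if (component[i] == '%') {
                if (i + 2 >= component.length
                        || !HEX.get(component[i + 1])
                        || !HEX.get(component[i + 2])) {
                    return false;
                }
                i += 2; // skip the two hex digits just checked
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(escapesWellFormed("Los%20Angeles".toCharArray()));
        System.out.println(escapesWellFormed("100%".toCharArray()));
    }
}
```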
         

      • mark

        protected static final java.util.BitSet mark
        BitSet for mark.

         mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
                         "(" | ")"
         

      • unreserved

        protected static final java.util.BitSet unreserved
        Data characters that are allowed in a URI but do not have a reserved purpose are called unreserved.

         unreserved    = alphanum | mark
         

      • reserved

        protected static final java.util.BitSet reserved
        BitSet for reserved.

         reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                         "$" | ","
         

      • uric

        protected static final java.util.BitSet uric
        BitSet for uric.

         uric          = reserved | unreserved | escaped
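
The unreserved, reserved and uric sets are what an encoder consults to decide which octets may pass through unescaped. The sketch below shows that idea under stated assumptions: the `encode` helper here is written for illustration (it takes a `Charset` rather than a charset name) and is not the protected `encode` method of this class.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

class AllowedSetEncoder {
    static final BitSet UNRESERVED = new BitSet(256);
    static final BitSet RESERVED = new BitSet(256);
    static {
        for (char c = 'a'; c <= 'z'; c++) UNRESERVED.set(c);
        for (char c = 'A'; c <= 'Z'; c++) UNRESERVED.set(c);
        for (char c = '0'; c <= '9'; c++) UNRESERVED.set(c);
        for (char c : "-_.!~*'()".toCharArray()) UNRESERVED.set(c); // mark
        for (char c : ";/?:@&=+$,".toCharArray()) RESERVED.set(c);
    }

    /** Percent-escapes every octet whose value is not in the allowed set. */
    static String encode(String original, BitSet allowed, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : original.getBytes(charset)) {
            int v = b & 0xFF;
            if (allowed.get(v)) {
                sb.append((char) v);
            } else {
                sb.append('%')
                  .append(Character.toUpperCase(Character.forDigit(v >> 4, 16)))
                  .append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("100% sure", UNRESERVED, StandardCharsets.UTF_8));
        BitSet reservedOrUnreserved = (BitSet) UNRESERVED.clone();
        reservedOrUnreserved.or(RESERVED);
        System.out.println(encode("a/b?c", reservedOrUnreserved, StandardCharsets.UTF_8));
    }
}
```

Note that "%" is in neither set, so a literal percent sign in the data always comes out as `%25`.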
         

      • fragment

        protected static final java.util.BitSet fragment
        BitSet for fragment (alias for uric).

         fragment      = *uric
         

      • query

        protected static final java.util.BitSet query
        BitSet for query (alias for uric).

         query         = *uric
         

      • pchar

        protected static final java.util.BitSet pchar
        BitSet for pchar.

         pchar         = unreserved | escaped |
                         ":" | "@" | "&" | "=" | "+" | "$" | ","
         

      • param

        protected static final java.util.BitSet param
        BitSet for param (alias for pchar).

         param         = *pchar
         

      • segment

        protected static final java.util.BitSet segment
        BitSet for segment.

         segment       = *pchar *( ";" param )
         

      • path_segments

        protected static final java.util.BitSet path_segments
        BitSet for path segments.

         path_segments = segment *( "/" segment )
         

      • abs_path

        protected static final java.util.BitSet abs_path
        URI absolute path.

         abs_path      = "/"  path_segments
         

      • uric_no_slash

        protected static final java.util.BitSet uric_no_slash
        URI bitset for encoding typical non-slash characters.

         uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                         "&" | "=" | "+" | "$" | ","
         

      • opaque_part

        protected static final java.util.BitSet opaque_part
        URI bitset that combines uric_no_slash and uric.

         opaque_part   = uric_no_slash *uric
         

      • path

        protected static final java.util.BitSet path
        URI bitset that combines absolute path and opaque part.

         path          = [ abs_path | opaque_part ]
         

      • port

        protected static final java.util.BitSet port
        Port, a logical alias for digit.
      • IPv4address

        protected static final java.util.BitSet IPv4address
        BitSet that combines digit and dot for IPv4address.

         IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
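
A set of digits plus dot accepts strings that the grammar rule itself rejects, so a character filter is necessary but not sufficient. The sketch below contrasts the two checks; both helper methods are hypothetical and say nothing about how this class decides `isIPv4address()`.

```java
import java.util.BitSet;

class Ipv4Check {
    static final BitSet IPV4_CHARS = new BitSet(256);
    static {
        IPV4_CHARS.set('0', '9' + 1); // digits
        IPV4_CHARS.set('.');          // dot
    }

    /** Character-level filter: non-empty and only digits and dots. */
    static boolean onlyIpv4Chars(String host) {
        if (host.isEmpty()) return false;
        for (int i = 0; i < host.length(); i++) {
            if (!IPV4_CHARS.get(host.charAt(i))) return false;
        }
        return true;
    }

    /** Grammar-level check: exactly four non-empty digit groups separated by dots. */
    static boolean matchesGrammar(String host) {
        if (!onlyIpv4Chars(host)) return false;
        String[] groups = host.split("\\.", -1); // -1 keeps empty trailing groups
        if (groups.length != 4) return false;
        for (String g : groups) {
            if (g.isEmpty()) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(onlyIpv4Chars("1..2") + " " + matchesGrammar("1..2"));
        System.out.println(matchesGrammar("192.168.0.1"));
    }
}
```

The RFC 2396 rule places no upper bound on each group, so this sketch does not range-check the numbers either.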
         

      • IPv6address

        protected static final java.util.BitSet IPv6address
        RFC 2373.

         IPv6address = hexpart [ ":" IPv4address ]
         

      • IPv6reference

        protected static final java.util.BitSet IPv6reference
        RFC 2732, 2373.

         IPv6reference   = "[" IPv6address "]"
         

      • toplabel

        protected static final java.util.BitSet toplabel
        BitSet for toplabel.

         toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
         

      • domainlabel

        protected static final java.util.BitSet domainlabel
        BitSet for domainlabel.

         domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
         

      • hostname

        protected static final java.util.BitSet hostname
        BitSet for hostname.

         hostname      = *( domainlabel "." ) toplabel [ "." ]
         

      • host

        protected static final java.util.BitSet host
        BitSet for host.

         host          = hostname | IPv4address | IPv6reference
         

      • hostport

        protected static final java.util.BitSet hostport
        BitSet for hostport.

         hostport      = host [ ":" port ]
         

      • userinfo

        protected static final java.util.BitSet userinfo
        Bitset for userinfo.

         userinfo      = *( unreserved | escaped |
                            ";" | ":" | "&" | "=" | "+" | "$" | "," )
         

      • within_userinfo

        public static final java.util.BitSet within_userinfo
        BitSet for within the userinfo component like user and password.
      • server

        protected static final java.util.BitSet server
        Bitset for server.

         server        = [ [ userinfo "@" ] hostport ]
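
Read together, the userinfo, hostport and server rules describe how a server-based authority splits into three parts. Here is a minimal sketch of that split, assuming the input is already a syntactically valid server authority; the class, its fields and the use of null and -1 for absent parts are choices made for this illustration.

```java
class ServerAuthority {
    final String userinfo; // null when absent
    final String host;
    final int port;        // -1 when absent

    ServerAuthority(String userinfo, String host, int port) {
        this.userinfo = userinfo;
        this.host = host;
        this.port = port;
    }

    /** Splits [ userinfo "@" ] host [ ":" port ] into its parts. */
    static ServerAuthority parse(String authority) {
        String userinfo = null;
        String hostport = authority;
        int at = authority.indexOf('@'); // "@" cannot appear unescaped inside userinfo
        if (at >= 0) {
            userinfo = authority.substring(0, at);
            hostport = authority.substring(at + 1);
        }
        // An IPv6reference is bracketed and contains colons, so only look
        // for the port separator after the closing bracket.
        int from = hostport.startsWith("[") ? hostport.indexOf(']') : 0;
        int colon = from < 0 ? -1 : hostport.indexOf(':', from);
        String host = hostport;
        int port = -1;
        if (colon >= 0) {
            host = hostport.substring(0, colon);
            String digits = hostport.substring(colon + 1);
            if (!digits.isEmpty()) {
                port = Integer.parseInt(digits); // throws NumberFormatException on non-digits
            }
        }
        return new ServerAuthority(userinfo, host, port);
    }

    public static void main(String[] args) {
        ServerAuthority a = parse("user:pw@example.org:8080");
        System.out.println(a.userinfo + " | " + a.host + " | " + a.port);
    }
}
```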
         

      • reg_name

        protected static final java.util.BitSet reg_name
        BitSet for reg_name.

         reg_name      = 1*( unreserved | escaped | "$" | "," |
                             ";" | ":" | "@" | "&" | "=" | "+" )
         

      • authority

        protected static final java.util.BitSet authority
        BitSet for authority.

         authority     = server | reg_name
         

      • scheme

        protected static final java.util.BitSet scheme
        BitSet for scheme.

         scheme        = alpha *( alpha | digit | "+" | "-" | "." )
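
The scheme rule differs from most others in that the first character is more restricted than the rest. A short sketch of a direct check, using a hypothetical helper `isValidScheme`:

```java
class SchemeCheck {
    static boolean isAlpha(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** scheme = alpha *( alpha | digit | "+" | "-" | "." ) */
    static boolean isValidScheme(String s) {
        if (s.isEmpty() || !isAlpha(s.charAt(0))) {
            return false; // must start with a letter
        }
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!(isAlpha(c) || isDigit(c) || c == '+' || c == '-' || c == '.')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidScheme("http") + " " + isValidScheme("1http"));
    }
}
```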
         

      • rel_segment

        protected static final java.util.BitSet rel_segment
        BitSet for rel_segment.

         rel_segment   = 1*( unreserved | escaped |
                             ";" | "@" | "&" | "=" | "+" | "$" | "," )
         

      • rel_path

        protected static final java.util.BitSet rel_path
        BitSet for rel_path.

         rel_path      = rel_segment [ abs_path ]
         

      • net_path

        protected static final java.util.BitSet net_path
        BitSet for net_path.

         net_path      = "//" authority [ abs_path ]
         

      • hier_part

        protected static final java.util.BitSet hier_part
        BitSet for hier_part.

         hier_part     = ( net_path | abs_path ) [ "?" query ]
         

      • relativeURI

        protected static final java.util.BitSet relativeURI
        BitSet for relativeURI.

         relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
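
The three alternatives of relativeURI can be told apart by their leading characters alone. The sketch below is a rough classifier under that observation; the `classify` helper and its string labels are hypothetical, and it performs no character validation.

```java
class RelativeKind {
    /** Classifies the path part of a relative reference by its prefix. */
    static String classify(String relative) {
        int q = relative.indexOf('?');
        String pathPart = q < 0 ? relative : relative.substring(0, q); // drop [ "?" query ]
        if (pathPart.startsWith("//")) {
            return "net_path";
        }
        if (pathPart.startsWith("/")) {
            return "abs_path";
        }
        // rel_segment requires at least one character
        return pathPart.isEmpty() ? "empty" : "rel_path";
    }

    public static void main(String[] args) {
        System.out.println(classify("//example.org/a"));
        System.out.println(classify("/a/b?x=1"));
        System.out.println(classify("../c"));
    }
}
```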
         

      • absoluteURI

        protected static final java.util.BitSet absoluteURI
        BitSet for absoluteURI.

         absoluteURI   = scheme ":" ( hier_part | opaque_part )
         

      • URI_reference

        protected static final java.util.BitSet URI_reference
        BitSet for URI-reference.

         URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
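
Because "#" is not a member of uric, the first "#" in a URI reference always marks the start of the fragment. A minimal sketch of that split follows; the helper names are hypothetical, and the "#intro" fragment in the demo is invented for illustration.

```java
class FragmentSplit {
    /** Returns the reference with any "#fragment" suffix removed. */
    static String withoutFragment(String reference) {
        int hash = reference.indexOf('#');
        return hash < 0 ? reference : reference.substring(0, hash);
    }

    /** Returns the fragment after "#", or null when there is none. */
    static String fragmentOf(String reference) {
        int hash = reference.indexOf('#');
        return hash < 0 ? null : reference.substring(hash + 1);
    }

    public static void main(String[] args) {
        String ref = "http://www.math.uio.no/faq/compression-faq/part1.html#intro";
        System.out.println(withoutFragment(ref));
        System.out.println(fragmentOf(ref));
    }
}
```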
         

      • control

        public static final java.util.BitSet control
        BitSet for control.
      • space

        public static final java.util.BitSet space
        BitSet for space.
      • delims

        public static final java.util.BitSet delims
        BitSet for delims.
      • unwise

        public static final java.util.BitSet unwise
        BitSet for unwise.
      • disallowed_rel_path

        public static final java.util.BitSet disallowed_rel_path
        Disallowed rel_path before escaping.
      • disallowed_opaque_part

        public static final java.util.BitSet disallowed_opaque_part
        Disallowed opaque_part before escaping.
      • allowed_authority

        public static final java.util.BitSet allowed_authority
        Those characters that are allowed for the authority component.
      • allowed_opaque_part

        public static final java.util.BitSet allowed_opaque_part
        Those characters that are allowed for the opaque_part.
      • allowed_reg_name

        public static final java.util.BitSet allowed_reg_name
        Those characters that are allowed for the reg_name.
      • allowed_userinfo

        public static final java.util.BitSet allowed_userinfo
        Those characters that are allowed for the userinfo component.
      • allowed_within_userinfo

        public static final java.util.BitSet allowed_within_userinfo
        Those characters that are allowed within the userinfo component.
      • allowed_IPv6reference

        public static final java.util.BitSet allowed_IPv6reference
        Those characters that are allowed for the IPv6reference component. The characters '[', ']' in IPv6reference should be excluded.
      • allowed_host

        public static final java.util.BitSet allowed_host
        Those characters that are allowed for the host component. The characters '[', ']' in IPv6reference should be excluded.
      • allowed_within_authority

        public static final java.util.BitSet allowed_within_authority
        Those characters that are allowed for the authority component.
      • allowed_abs_path

        public static final java.util.BitSet allowed_abs_path
        Those characters that are allowed for the abs_path.
      • allowed_rel_path

        public static final java.util.BitSet allowed_rel_path
        Those characters that are allowed for the rel_path.
      • allowed_within_path

        public static final java.util.BitSet allowed_within_path
        Those characters that are allowed within the path.
      • allowed_query

        public static final java.util.BitSet allowed_query
        Those characters that are allowed for the query component.
      • allowed_within_query

        public static final java.util.BitSet allowed_within_query
        Those characters that are allowed within the query component.
      • allowed_fragment

        public static final java.util.BitSet allowed_fragment
        Those characters that are allowed for the fragment component.
      • _is_hier_part

        protected boolean _is_hier_part
      • _is_opaque_part

        protected boolean _is_opaque_part
      • _is_net_path

        protected boolean _is_net_path
      • _is_abs_path

        protected boolean _is_abs_path
      • _is_rel_path

        protected boolean _is_rel_path
      • _is_reg_name

        protected boolean _is_reg_name
      • _is_server

        protected boolean _is_server
      • _is_hostname

        protected boolean _is_hostname
      • _is_IPv4address

        protected boolean _is_IPv4address
      • _is_IPv6reference

        protected boolean _is_IPv6reference
    • Constructor Detail

      • URI

        protected URI()
        Create an instance for internal use.
      • URI

        public URI​(java.lang.String s,
                   boolean escaped,
                   java.lang.String charset)
            throws URIException,
                   java.lang.NullPointerException
        Construct a URI from a string with the given charset. The input string can be either in escaped or unescaped form.
        Parameters:
        s - URI character sequence
        escaped - true if the URI character sequence is in escaped form, false otherwise
        charset - the charset string to do escape encoding, if required
        Throws:
        URIException - If the URI cannot be created.
        java.lang.NullPointerException - if input string is null
        Since:
        3.0
        See Also:
        getProtocolCharset()
      • URI

        public URI​(java.lang.String s,
                   boolean escaped)
            throws URIException,
                   java.lang.NullPointerException
        Construct a URI from a string. The input string can be either in escaped or unescaped form.
        Parameters:
        s - URI character sequence
        escaped - true if the URI character sequence is in escaped form, false otherwise
        Throws:
        URIException - If the URI cannot be created.
        java.lang.NullPointerException - if input string is null
        Since:
        3.0
        See Also:
        getProtocolCharset()
      • URI

        public URI​(char[] escaped,
                   java.lang.String charset)
            throws URIException,
                   java.lang.NullPointerException
        Deprecated.
        Use URI(String, boolean, String) instead.
        Construct a URI as an escaped form of a character array with the given charset.
        Parameters:
        escaped - the URI character sequence
        charset - the charset string to do escape encoding
        Throws:
        URIException - If the URI cannot be created.
        java.lang.NullPointerException - if escaped is null
        See Also:
        getProtocolCharset()
      • URI

        public URI​(char[] escaped)
            throws URIException,
                   java.lang.NullPointerException
        Deprecated.
        Use URI(String, boolean) instead.
        Construct a URI as an escaped form of a character array. A URI can be enclosed in double quotes or angle brackets, as in "http://test.com/" and <http://test.com/>.
        Parameters:
        escaped - the URI character sequence
        Throws:
        URIException - If the URI cannot be created.
        java.lang.NullPointerException - if escaped is null
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String original,
                   java.lang.String charset)
            throws URIException
        Deprecated.
        Use URI(String, boolean, String) instead.
        Construct a URI from the given string with the given charset.
        Parameters:
        original - the string to be represented as a URI character sequence; it is either an absoluteURI or a relativeURI
        charset - the charset string to do escape encoding
        Throws:
        URIException - If the URI cannot be created.
        See Also:
        getProtocolCharset()
      • URI

        public URI​(java.lang.String original)
            throws URIException
        Deprecated.
        Use URI(String, boolean) instead.
        Construct a URI from the given string.

           URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
         

        A URI can be enclosed in double quotes or angle brackets, as in "http://test.com/" and <http://test.com/>.

        Parameters:
        original - the string to be represented as a URI character sequence; it is either an absoluteURI or a relativeURI
        Throws:
        URIException - If the URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String schemeSpecificPart,
                   java.lang.String fragment)
            throws URIException
        Construct a general URI from the given components.

           URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
           absoluteURI   = scheme ":" ( hier_part | opaque_part )
           opaque_part   = uric_no_slash *uric
         

        This constructor is for an absolute URI of the form <scheme>:<scheme-specific-part>#<fragment>.

        Parameters:
        scheme - the scheme string
        schemeSpecificPart - the scheme-specific part string
        fragment - the fragment string
        Throws:
        URIException - If the URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String authority,
                   java.lang.String path,
                   java.lang.String query,
                   java.lang.String fragment)
            throws URIException
        Construct a general URI from the given components.

           URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
           absoluteURI   = scheme ":" ( hier_part | opaque_part )
           relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
           hier_part     = ( net_path | abs_path ) [ "?" query ]
         

        This constructor is for an absolute URI of the form <scheme>:<path>?<query>#<fragment> and a relative URI of the form <path>?<query>#<fragment>.

        Parameters:
        scheme - the scheme string
        authority - the authority string
        path - the path string
        query - the query string
        fragment - the fragment string
        Throws:
        URIException - If the new URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String userinfo,
                   java.lang.String host,
                   int port)
            throws URIException
        Construct a general URI from the given components.
        Parameters:
        scheme - the scheme string
        userinfo - the userinfo string
        host - the host string
        port - the port number
        Throws:
        URIException - If the new URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String userinfo,
                   java.lang.String host,
                   int port,
                   java.lang.String path)
            throws URIException
        Construct a general URI from the given components.
        Parameters:
        scheme - the scheme string
        userinfo - the userinfo string
        host - the host string
        port - the port number
        path - the path string
        Throws:
        URIException - If the new URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String userinfo,
                   java.lang.String host,
                   int port,
                   java.lang.String path,
                   java.lang.String query)
            throws URIException
        Construct a general URI from the given components.
        Parameters:
        scheme - the scheme string
        userinfo - the userinfo string
        host - the host string
        port - the port number
        path - the path string
        query - the query string
        Throws:
        URIException - If the new URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String userinfo,
                   java.lang.String host,
                   int port,
                   java.lang.String path,
                   java.lang.String query,
                   java.lang.String fragment)
            throws URIException
        Construct a general URI from the given components.
        Parameters:
        scheme - the scheme string
        userinfo - the userinfo string
        host - the host string
        port - the port number
        path - the path string
        query - the query string
        fragment - the fragment string
        Throws:
        URIException - If the new URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(java.lang.String scheme,
                   java.lang.String host,
                   java.lang.String path,
                   java.lang.String fragment)
            throws URIException
        Construct a general URI from the given components.
        Parameters:
        scheme - the scheme string
        host - the host string
        path - the path string
        fragment - the fragment string
        Throws:
        URIException - If the new URI cannot be created.
        See Also:
        getDefaultProtocolCharset()
      • URI

        public URI​(URI base,
                   java.lang.String relative)
            throws URIException
        Deprecated.
        Use URI(URI, String, boolean) instead.
        Construct a general URI with the given relative URI string.
        Parameters:
        base - the base URI
        relative - the relative URI string
        Throws:
        URIException - If the new URI cannot be created.
      • URI

        public URI​(URI base,
                   java.lang.String relative,
                   boolean escaped)
            throws URIException
        Construct a general URI with the given relative URI string.
        Parameters:
        base - the base URI
        relative - the relative URI string
        escaped - true if the URI character sequence is in escaped form, false otherwise
        Throws:
        URIException - If the new URI cannot be created.
        Since:
        3.0
      • URI

        public URI​(URI base,
                   URI relative)
            throws URIException
        Construct a general URI with the given relative URI.

           URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
           relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
         

        Resolving Relative References to Absolute Form.

        Examples of Resolving Relative URI References

        Within an object with a well-defined base URI of

           http://a/b/c/d;p?q
         

        the relative URI would be resolved as follows:

        Normal Examples

           g:h           =  g:h
           g             =  http://a/b/c/g
           ./g           =  http://a/b/c/g
           g/            =  http://a/b/c/g/
           /g            =  http://a/g
           //g           =  http://g
           ?y            =  http://a/b/c/?y
           g?y           =  http://a/b/c/g?y
           #s            =  (current document)#s
           g#s           =  http://a/b/c/g#s
           g?y#s         =  http://a/b/c/g?y#s
           ;x            =  http://a/b/c/;x
           g;x           =  http://a/b/c/g;x
           g;x?y#s       =  http://a/b/c/g;x?y#s
           .             =  http://a/b/c/
           ./            =  http://a/b/c/
           ..            =  http://a/b/
           ../           =  http://a/b/
           ../g          =  http://a/b/g
           ../..         =  http://a/
           ../../        =  http://a/ 
           ../../g       =  http://a/g
         

        Some URI schemes do not allow a hierarchical syntax matching the <hier_part> syntax, and thus cannot use relative references.
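
        The resolution table above can be reproduced with the JDK alone. The following is a minimal sketch that uses java.net.URI (not the URI class documented here) as a stand-in, on the assumption that both follow the RFC 2396 resolution rules for these inputs; the class name ResolveSketch is hypothetical.

```java
import java.net.URI;

class ResolveSketch {
    // The well-defined base URI used by the examples in the text.
    static final URI BASE = URI.create("http://a/b/c/d;p?q");

    // Resolve a relative reference against BASE and return its string form.
    static String resolve(String relative) {
        return BASE.resolve(relative).toString();
    }

    public static void main(String[] args) {
        String[] refs = {"g", "./g", "g/", "/g", "//g", "g?y#s", "../g", "../../g"};
        for (String ref : refs) {
            System.out.printf("%-8s = %s%n", ref, resolve(ref));
        }
    }
}
```

        Running main prints one line per reference, in the same "relative = resolved" layout as the table.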

        Parameters:
        base - the base URI
        relative - the relative URI
        Throws:
        URIException - If the new URI cannot be created.
    • Method Detail

      • encode

        protected static char[] encode​(java.lang.String original,
                                       java.util.BitSet allowed,
                                       java.lang.String charset)
                                throws URIException
        Encodes a URI string. This is a two-step mapping: first from original characters to octets, and then from octets to URI characters:

           original character sequence->octet sequence->URI character sequence
         

        An escaped octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing the octet code. For example, "%20" is the escaped encoding for the US-ASCII space character.

        Conversion from the local filesystem character set to UTF-8 will normally involve a two step process. First convert the local character set to the UCS; then convert the UCS to UTF-8. The first step in the process can be performed by maintaining a mapping table that includes the local character set code and the corresponding UCS code. The next step is to convert the UCS character code to the UTF-8 encoding.

        Mapping between vendor codepages can be done in a very similar manner as described above.

        Escape encoding may only be applied when a URI is being created from its component parts. Escaping and validation are performed internally by this method.
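
        The two-step mapping described above can be sketched with the JDK alone. This is an illustration under stated assumptions, not this class's implementation: the class EscapeSketch, its method names, and the choice of upper-case hexadecimal digits are all hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

class EscapeSketch {
    // original character sequence -> octet sequence -> URI character sequence
    static String encode(String original, BitSet allowed, Charset charset) {
        byte[] octets = original.getBytes(charset);        // step 1: characters to octets
        StringBuilder uriChars = new StringBuilder();
        for (byte b : octets) {                             // step 2: octets to URI characters
            int octet = b & 0xFF;
            if (allowed.get(octet)) {
                uriChars.append((char) octet);              // allowed octets pass through
            } else {
                uriChars.append('%')                        // everything else becomes %HH
                        .append(Character.toUpperCase(Character.forDigit(octet >> 4, 16)))
                        .append(Character.toUpperCase(Character.forDigit(octet & 0xF, 16)));
            }
        }
        return uriChars.toString();
    }

    // A deliberately small allowed set (ASCII letters and digits), for illustration only.
    static BitSet lettersAndDigits() {
        BitSet set = new BitSet(128);
        for (char c = 'a'; c <= 'z'; c++) set.set(c);
        for (char c = 'A'; c <= 'Z'; c++) set.set(c);
        for (char c = '0'; c <= '9'; c++) set.set(c);
        return set;
    }

    public static void main(String[] args) {
        System.out.println(encode("Los Angeles", lettersAndDigits(), StandardCharsets.UTF_8));
    }
}
```

        Note that the charset decides which octets are produced in the first step, so the same original character can yield different escape sequences under different charsets.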

        Parameters:
        original - the original character sequence
        allowed - those characters that are allowed within a component
        charset - the protocol charset
        Returns:
        URI character sequence
        Throws:
        URIException - if the component is null or the character encoding is unsupported
      • decode

        protected static java.lang.String decode​(char[] component,
                                                 java.lang.String charset)
                                          throws URIException
        Decodes a URI-encoded string. This is a two-step mapping: first from URI characters to octets, and then from octets to original characters:

           URI character sequence->octet sequence->original character sequence
         

        A URI must be separated into its components before the escaped characters within those components can be safely decoded.

        Note that URI characters that are not UTF-8 may nevertheless be parsed as valid UTF-8. A recent non-scientific analysis found that EUC encoded Japanese words had a 2.7% false reading; SJIS had a 0.0005% false reading; other encodings such as ASCII or KOI-8 have a 0% false reading.

        The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.

        Unescaping is performed internally by this method.
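
        The reverse mapping can be sketched in the same way. The code below is a hypothetical, JDK-only illustration (the class UnescapeSketch is not part of this API); it assumes the component contains only US-ASCII characters and signals a malformed or incomplete escape with IllegalArgumentException rather than URIException.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class UnescapeSketch {
    // URI character sequence -> octet sequence -> original character sequence
    static String decode(String component, Charset charset) {
        ByteArrayOutputStream octets = new ByteArrayOutputStream();
        for (int i = 0; i < component.length(); i++) {
            char c = component.charAt(i);
            if (c == '%') {
                // A '%' must be followed by exactly two hexadecimal digits.
                if (i + 2 >= component.length()) {
                    throw new IllegalArgumentException("incomplete trailing escape pattern");
                }
                int hi = Character.digit(component.charAt(i + 1), 16);
                int lo = Character.digit(component.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("invalid escape at index " + i);
                }
                octets.write((hi << 4) | lo);               // step 1: URI characters to octets
                i += 2;
            } else {
                octets.write(c);                            // assumes c is US-ASCII
            }
        }
        return new String(octets.toByteArray(), charset);   // step 2: octets to characters
    }

    public static void main(String[] args) {
        System.out.println(decode("Los%20Angeles", StandardCharsets.UTF_8));
    }
}
```

        Decoding "%2520" once yields "%20"; decoding that result a second time would yield a space, which is the double-unescaping hazard the class description warns about.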

        Parameters:
        component - the URI character sequence
        charset - the protocol charset
        Returns:
        original character sequence
        Throws:
        URIException - incomplete trailing escape pattern or unsupported character encoding
      • decode

        protected static java.lang.String decode​(java.lang.String component,
                                                 java.lang.String charset)
                                          throws URIException
        Decodes a URI-encoded string. This is a two-step mapping: first from URI characters to octets, and then from octets to original characters:

           URI character sequence->octet sequence->original character sequence
         

        A URI must be separated into its components before the escaped characters within those components can be safely decoded.

        Note that URI characters that are not UTF-8 may nevertheless be parsed as valid UTF-8. A recent non-scientific analysis found that EUC encoded Japanese words had a 2.7% false reading; SJIS had a 0.0005% false reading; other encodings such as ASCII or KOI-8 have a 0% false reading.

        The percent "%" character always has the reserved purpose of being the escape indicator; it must be escaped as "%25" in order to be used as data within a URI.

        Unescaping is performed internally by this method.

        Parameters:
        component - the URI character sequence
        charset - the protocol charset
        Returns:
        original character sequence
        Throws:
        URIException - incomplete trailing escape pattern or unsupported character encoding
        Since:
        3.0
      • prevalidate

        protected boolean prevalidate​(java.lang.String component,
                                      java.util.BitSet disallowed)
        Pre-validate the unescaped URI string within a specific component.
        Parameters:
        component - the unescaped component string to check
        disallowed - those characters disallowed within the component
        Returns:
        true if the component contains none of the disallowed characters; false if the component is undefined or incorrect
      • validate

        protected boolean validate​(char[] component,
                                   java.util.BitSet generous)
        Validate the URI characters within a specific component. Validation must be performed after escape encoding, or on a component that does not include escaped characters.
        Parameters:
        component - the character sequence within the component
        generous - those characters that are allowed within a component
        Returns:
        true if it is a correct URI character sequence
      • validate

        protected boolean validate​(char[] component,
                                   int soffset,
                                   int eoffset,
                                   java.util.BitSet generous)
        Validate the URI characters within a specific component. Validation must be performed after escape encoding, or on a component that does not include escaped characters.

        This validation is generous rather than strict. Strict validation may be performed before this method is called.

        Parameters:
        component - the character sequence within the component
        soffset - the starting offset of the given component
        eoffset - the ending offset of the given component; -1 means the length of the component
        generous - those characters that are allowed within a component
        Returns:
        true if it is a correct URI character sequence
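
        The generous check described above amounts to testing each character in a range against a BitSet. The following is a minimal sketch under assumptions: the class ValidateSketch and the helper of() are hypothetical, and the ending offset is treated as inclusive, with -1 meaning "through the last character".

```java
import java.util.BitSet;

class ValidateSketch {
    // Returns true if every character in [soffset, eoffset] is in the allowed set.
    static boolean validate(char[] component, int soffset, int eoffset, BitSet generous) {
        if (eoffset == -1) {
            eoffset = component.length - 1;     // assumption: -1 selects the whole tail
        }
        for (int i = soffset; i <= eoffset; i++) {
            if (!generous.get(component[i])) {
                return false;                   // first character outside the set fails the check
            }
        }
        return true;
    }

    // Builds a BitSet containing exactly the characters of the given string.
    static BitSet of(String chars) {
        BitSet set = new BitSet(128);
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet digits = of("0123456789");
        System.out.println(validate("8080".toCharArray(), 0, -1, digits));
    }
}
```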
      • parseUriReference

        protected void parseUriReference​(java.lang.String original,
                                         boolean escaped)
                                  throws URIException
        Parse a URI reference as a String with the character encoding of the local system or the document, in order to avoid any possibility of conflict with non-ASCII characters.

        The following line is the regular expression for breaking down a URI reference into its components.

           ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
            12            3  4          5       6  7        8 9
         

        For example, matching the above expression to http://jakarta.apache.org/ietf/uri/#Related results in the following subexpression matches:

                       $1 = http:
          scheme    =  $2 = http
                       $3 = //jakarta.apache.org
          authority =  $4 = jakarta.apache.org
          path      =  $5 = /ietf/uri/
                       $6 = 
          query     =  $7 = 
                       $8 = #Related
          fragment  =  $9 = Related
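
        The same expression can be tried directly with java.util.regex. This is a minimal sketch, assuming only the regular expression quoted above; the class UriRegexSketch and the method split are hypothetical and do not reflect how parseUriReference is implemented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UriRegexSketch {
    // The regular expression from the text, with backslashes doubled for a Java string literal.
    static final Pattern URI_REFERENCE =
            Pattern.compile("^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns { scheme, authority, path, query, fragment }, i.e. groups 2, 4, 5, 7 and 9.
    static String[] split(String uriReference) {
        Matcher m = URI_REFERENCE.matcher(uriReference);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a URI reference: " + uriReference);
        }
        return new String[] {m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)};
    }

    public static void main(String[] args) {
        String[] parts = split("http://jakarta.apache.org/ietf/uri/#Related");
        String[] names = {"scheme", "authority", "path", "query", "fragment"};
        for (int i = 0; i < parts.length; i++) {
            System.out.println(names[i] + " = " + parts[i]);
        }
    }
}
```

        In Java, a group that did not take part in the match is reported as null rather than as an empty string, so the query in this example is null.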
         

        Parameters:
        original - the original character sequence
        escaped - true if original is escaped
        Throws:
        URIException - If an error occurs.
      • indexFirstOf

        protected int indexFirstOf​(java.lang.String s,
                                   java.lang.String delims)
        Get the earliest index at which any one of the given delimiters first occurs in the given string.
        Parameters:
        s - the string to be indexed
        delims - the delimiters used to index
        Returns:
        the earliest index, if any of the delimiters is found
      • indexFirstOf

        protected int indexFirstOf​(java.lang.String s,
                                   java.lang.String delims,
                                   int offset)
        Get the earliest index at which any one of the given delimiters first occurs in the given string, searching from the given offset.
        Parameters:
        s - the string to be indexed
        delims - the delimiters used to index
        offset - the index to start searching from
        Returns:
        the earliest index, if any of the delimiters is found
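
        The search described above can be sketched as taking the minimum over one indexOf call per delimiter. This is a hypothetical illustration (the class IndexFirstOfSketch is not part of this API); returning -1 when no delimiter is found is an assumption of the sketch, since the text does not state the not-found value.

```java
class IndexFirstOfSketch {
    // Earliest index >= offset at which any character of delims occurs in s, or -1.
    static int indexFirstOf(String s, String delims, int offset) {
        if (s == null || delims == null) {
            return -1;
        }
        int from = Math.max(offset, 0);             // negative offsets are clamped to zero
        int earliest = -1;
        for (int i = 0; i < delims.length(); i++) {
            int at = s.indexOf(delims.charAt(i), from);
            if (at >= 0 && (earliest < 0 || at < earliest)) {
                earliest = at;                      // keep the smallest index seen so far
            }
        }
        return earliest;
    }

    public static void main(String[] args) {
        System.out.println(indexFirstOf("a/b?c#d", "?#", 0));
    }
}
```

        The result does not depend on the order of the characters in delims, only on where they occur in s.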
      • indexFirstOf

        protected int indexFirstOf​(char[] s,
                                   char delim)
        Get the earliest index at which the given delimiter first occurs in the given array.
        Parameters:
        s - the character array to be indexed
        delim - the delimiter used to index
        Returns:
        the earliest index, if the delimiter is found
      • indexFirstOf

        protected int indexFirstOf​(char[] s,
                                   char delim,
                                   int offset)
        Get the earliest index at which the given delimiter first occurs in the given array, searching from the given offset.
        Parameters:
        s - the character array to be indexed
        delim - the delimiter used to index
        offset - the index to start searching from
        Returns:
        the earliest index, if the delimiter is found
      • parseAuthority

        protected void parseAuthority​(java.lang.String original,
                                      boolean escaped)
                               throws URIException
        Parse the authority component.
        Parameters:
        original - the original character sequence of authority component
        escaped - true if original is escaped
        Throws:
        URIException - If an error occurs.
      • setURI

        protected void setURI()
        Set this URI once it has been parsed successfully.
        See Also:
        getRawURI()
      • isAbsoluteURI

        public boolean isAbsoluteURI()
        Tell whether or not this URI is absolute.
        Returns:
        true if and only if this URI is absoluteURI
      • isRelativeURI

        public boolean isRelativeURI()
        Tell whether or not this URI is relative.
        Returns:
        true if and only if this URI is relativeURI
      • isHierPart

        public boolean isHierPart()
        Tell whether or not the absoluteURI of this URI is hier_part.
        Returns:
        true if and only if the absoluteURI is hier_part
      • isOpaquePart

        public boolean isOpaquePart()
        Tell whether or not the absoluteURI of this URI is opaque_part.
        Returns:
        true if and only if the absoluteURI is opaque_part
      • isNetPath

        public boolean isNetPath()
        Tell whether or not the relativeURI or hier_part of this URI is net_path. It is equivalent to the hasAuthority() method.
        Returns:
        true if and only if the relativeURI or hier_part is net_path
        See Also:
        hasAuthority()
      • isAbsPath

        public boolean isAbsPath()
        Tell whether or not the relativeURI or hier_part of this URI is abs_path.
        Returns:
        true if and only if the relativeURI or hier_part is abs_path
      • isRelPath

        public boolean isRelPath()
        Tell whether or not the relativeURI of this URI is rel_path.
        Returns:
        true if and only if the relativeURI is rel_path
      • hasAuthority

        public boolean hasAuthority()
        Tell whether or not this URI has authority. It is equivalent to the isNetPath() method.
        Returns:
        true if and only if this URI has authority
        See Also:
        isNetPath()
      • isRegName

        public boolean isRegName()
        Tell whether or not the authority component of this URI is reg_name.
        Returns:
        true if and only if the authority component is reg_name
      • isServer

        public boolean isServer()
        Tell whether or not the authority component of this URI is server.
        Returns:
        true if and only if the authority component is server
      • hasUserinfo

        public boolean hasUserinfo()
        Tell whether or not this URI has userinfo.
        Returns:
        true if and only if this URI has userinfo
      • isHostname

        public boolean isHostname()
        Tell whether or not the host part of this URI is hostname.
        Returns:
        true if and only if the host part is hostname
      • isIPv4address

        public boolean isIPv4address()
        Tell whether or not the host part of this URI is IPv4address.
        Returns:
        true if and only if the host part is IPv4address
      • isIPv6reference

        public boolean isIPv6reference()
        Tell whether or not the host part of this URI is IPv6reference.
        Returns:
        true if and only if the host part is IPv6reference
      • hasQuery

        public boolean hasQuery()
        Tell whether or not this URI has query.
        Returns:
        true if and only if this URI has query
      • hasFragment

        public boolean hasFragment()
        Tell whether or not this URI has fragment.
        Returns:
        true if and only if this URI has fragment
      • setDefaultProtocolCharset

        public static void setDefaultProtocolCharset​(java.lang.String charset)
                                              throws URI.DefaultCharsetChanged
        Set the default charset of the protocol.

        The character set used to store files SHALL remain a local decision and MAY depend on the capability of local operating systems. Prior to the exchange of URIs they SHOULD be converted into a ISO/IEC 10646 format and UTF-8 encoded. This approach, while allowing international exchange of URIs, will still allow backward compatibility with older systems because the code set positions for ASCII characters are identical to the one byte sequence in UTF-8.
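
        The backward-compatibility point above can be checked with the JDK alone. The following is a small sketch (the class Utf8AsciiSketch is hypothetical) showing that an ASCII-only string produces the same octets in US-ASCII and UTF-8, while a non-ASCII character needs more than one octet in UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8AsciiSketch {
    // True if the string has identical octets when encoded as US-ASCII and as UTF-8.
    static boolean sameOctets(String s) {
        return Arrays.equals(s.getBytes(StandardCharsets.US_ASCII),
                             s.getBytes(StandardCharsets.UTF_8));
    }

    // Number of octets the string occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(sameOctets("ftp://host/path"));
        System.out.println(utf8Length("\u00e9"));   // U+00E9, LATIN SMALL LETTER E WITH ACUTE
    }
}
```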

        An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used.

        The setter always succeeds and then always throws a DefaultCharsetChanged exception, so API programmers must call it as follows:

          import org.apache.util.URI.DefaultCharsetChanged;
              .
              .
              .
          try {
              URI.setDefaultProtocolCharset("UTF-8");
          } catch (DefaultCharsetChanged cc) {
              // CASE 1: the exception can be ignored when the charset is set by the user
              if (cc.getReasonCode() == DefaultCharsetChanged.PROTOCOL_CHARSET) {
                  // CASE 2: let the user know the default protocol charset changed
              } else {
                  // CASE 3: let the user know the default document charset changed
              }
          }
          
        The API programmer is responsible for setting the correct charset, and each application should keep track of the charset it supports.
        Parameters:
        charset - the default charset for each protocol
        Throws:
        URI.DefaultCharsetChanged - default charset changed
      • getDefaultProtocolCharset

        public static java.lang.String getDefaultProtocolCharset()
        Get the default charset of the protocol.

        An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used.

        To work globally either requires support of a number of character sets and to be able to convert between them, or the use of a single preferred character set. For support of global compatibility it is STRONGLY RECOMMENDED that clients and servers use UTF-8 encoding when exchanging URIs.

        Returns:
        the default charset string
      • getProtocolCharset

        public java.lang.String getProtocolCharset()
        Get the protocol charset used by this URI instance. It is set by the constructor for this instance. If it was not set by the constructor, the default protocol charset is returned.
        Returns:
        the protocol charset string
        See Also:
        getDefaultProtocolCharset()
      • setDefaultDocumentCharset

        public static void setDefaultDocumentCharset​(java.lang.String charset)
                                              throws URI.DefaultCharsetChanged
        Set the default charset of the document.

        Note that a URI may contain mixed character sets (e.g. ftp://host/KoreanNamespace/ChineseResource). To handle bidirectional display of these character sets, the protocol charset can simply be used again, because extracting the BIDI control characters inserted at different points during composition is not yet implemented.

        The setter always succeeds and always throws a DefaultCharsetChanged exception, so the API programmer must call it as follows:

          import org.apache.util.URI.DefaultCharsetChanged;
              .
              .
              .
          try {
              URI.setDefaultDocumentCharset("EUC-KR");
          } catch (DefaultCharsetChanged cc) {
              // CASE 1: the exception can be ignored when the charset was set by the user
              if (cc.getReasonCode() == DefaultCharsetChanged.DOCUMENT_CHARSET) {
                  // CASE 2: let the user know the default document charset changed
              } else {
                  // CASE 3: let the user know the default protocol charset changed
              }
          }
          
        The API programmer is responsible for setting the correct charset, and each application should keep track of the charset it supports.
        Parameters:
        charset - the default charset for the document
        Throws:
        URI.DefaultCharsetChanged - default charset changed
      • getDefaultDocumentCharset

        public static java.lang.String getDefaultDocumentCharset()
        Get the recommended default charset of the document.
        Returns:
        the default charset string
      • getDefaultDocumentCharsetByLocale

        public static java.lang.String getDefaultDocumentCharsetByLocale()
        Get the default charset of the document by locale.
        Returns:
        the default charset string by locale
      • getDefaultDocumentCharsetByPlatform

        public static java.lang.String getDefaultDocumentCharsetByPlatform()
        Get the default charset of the document by platform.
        Returns:
        the default charset string by platform
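
        As a rough sketch of what "by locale" and "by platform" lookups could look like, the following uses only the standard library. The class DocumentCharsetSketch and its tiny language table are hypothetical; the mapping this class actually uses is not shown in this document.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the URI class documented here. */
final class DocumentCharsetSketch {

    // Deliberately tiny, assumed language-to-charset table.
    private static final Map<String, String> BY_LANGUAGE =
            Map.of("ko", "EUC-KR", "ja", "Shift_JIS");

    /** Looks up a charset name by the locale's language, else returns the fallback. */
    static String byLocale(Locale locale, String fallback) {
        return BY_LANGUAGE.getOrDefault(locale.getLanguage(), fallback);
    }

    /** Returns the name of the JVM's default charset. */
    static String byPlatform() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(byLocale(Locale.KOREA, "UTF-8"));
        System.out.println(byPlatform());
    }
}
```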
      • getRawScheme

        public char[] getRawScheme()
        Get the scheme.
        Returns:
        the scheme
      • getScheme

        public java.lang.String getScheme()
        Get the scheme.
        Returns:
        the scheme, or null if the scheme is undefined
      • setRawAuthority

        public void setRawAuthority​(char[] escapedAuthority)
                             throws URIException,
                                    java.lang.NullPointerException
        Set the authority. It can be one of the following types: server, hostport, hostname, IPv4address, IPv6reference, or reg_name.

           authority     = server | reg_name
         

        Parameters:
        escapedAuthority - the raw escaped authority
        Throws:
        URIException - If parseAuthority(java.lang.String,boolean) fails
        java.lang.NullPointerException - null authority
      • setEscapedAuthority

        public void setEscapedAuthority​(java.lang.String escapedAuthority)
                                 throws URIException
        Set the authority. It can be one of the following types: server, hostport, hostname, IPv4address, IPv6reference, or reg_name. Note that there is no setAuthority method, because of escape encoding.
        Parameters:
        escapedAuthority - the escaped authority string
        Throws:
        URIException - If parseAuthority(java.lang.String,boolean) fails
      • getRawAuthority

        public char[] getRawAuthority()
        Get the raw-escaped authority.
        Returns:
        the raw-escaped authority
      • getEscapedAuthority

        public java.lang.String getEscapedAuthority()
        Get the escaped authority.
        Returns:
        the escaped authority
      • getRawUserinfo

        public char[] getRawUserinfo()
        Get the raw-escaped userinfo.
        Returns:
        the raw-escaped userinfo
        See Also:
        getAuthority()
      • getEscapedUserinfo

        public java.lang.String getEscapedUserinfo()
        Get the escaped userinfo.
        Returns:
        the escaped userinfo
        See Also:
        getAuthority()
      • getRawHost

        public char[] getRawHost()
        Get the host.

           host          = hostname | IPv4address | IPv6reference
         

        Returns:
        the host
        See Also:
        getAuthority()
      • getPort

        public int getPort()
        Get the port. To get the scheme-specific default port, use the protocol-specific class that extends URI and has a server-based naming authority.
        Returns:
        the port; -1 means either that the URI uses the default port for its scheme or that the specific URI does not support a server-based naming authority.
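
        The -1 convention can be handled with a small lookup on the caller's side. The sketch below uses java.net.URI as a stand-in (it follows the same -1 convention for an undefined port) rather than the class documented here. The DEFAULT_PORTS table and the effectivePort helper are hypothetical.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; uses java.net.URI as a stand-in. */
final class DefaultPortSketch {

    // Assumed table of well-known default ports.
    private static final Map<String, Integer> DEFAULT_PORTS =
            Map.of("http", 80, "https", 443, "ftp", 21);

    /** Returns the explicit port, else the scheme default, else -1. */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        String scheme = uri.getScheme();
        if (scheme == null) {
            return -1;
        }
        return DEFAULT_PORTS.getOrDefault(scheme.toLowerCase(Locale.ROOT), -1);
    }

    public static void main(String[] args) {
        System.out.println(effectivePort(URI.create("http://example.com/")));      // 80
        System.out.println(effectivePort(URI.create("http://example.com:8080/"))); // 8080
    }
}
```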
      • resolvePath

        protected char[] resolvePath​(char[] basePath,
                                     char[] relPath)
                              throws URIException
        Resolve the base and relative path.
        Parameters:
        basePath - a character array of the basePath
        relPath - a character array of the relPath
        Returns:
        the resolved path
        Throws:
        URIException - no more higher path level to be resolved
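
        The merge step of path resolution can be sketched as follows. This is a simplified illustration, not this method's implementation: it only replaces the last segment of the base path with the relative path and does not remove "." or ".." segments. Returning the base path for an empty relative path is an assumption of the sketch.

```java
/** Hypothetical helper for illustration; not the implementation of resolvePath. */
final class ResolvePathSketch {

    /** Merges a relative path onto a base path without removing dot segments. */
    static char[] resolvePath(char[] basePath, char[] relPath) {
        String base = basePath == null ? "" : new String(basePath);
        String rel = relPath == null ? "" : new String(relPath);
        if (rel.isEmpty()) {
            return base.toCharArray();            // assumption: nothing to resolve
        }
        if (rel.charAt(0) == '/') {
            return rel.toCharArray();             // already an absolute path
        }
        int lastSlash = base.lastIndexOf('/');
        // Keep everything up to and including the last '/', then append the relative part.
        String merged = base.substring(0, lastSlash + 1) + rel;
        return merged.toCharArray();
    }

    public static void main(String[] args) {
        char[] resolved = resolvePath("/rfc/rfc1808.txt".toCharArray(),
                                      "rfc2396.txt".toCharArray());
        System.out.println(new String(resolved)); // /rfc/rfc2396.txt
    }
}
```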
      • getRawCurrentHierPath

        protected char[] getRawCurrentHierPath​(char[] path)
                                        throws URIException
        Get the raw-escaped current hierarchy level in the given path. If the last namespace is a collection, the path string should end with a slash ('/').
        Parameters:
        path - the path
        Returns:
        the current hierarchy level
        Throws:
        URIException - no hierarchy level
      • getRawPath

        public char[] getRawPath()
        Get the raw-escaped path.

           path          = [ abs_path | opaque_part ]
         

        Returns:
        the raw-escaped path
      • getEscapedPath

        public java.lang.String getEscapedPath()
        Get the escaped path.

           path          = [ abs_path | opaque_part ]
           abs_path      = "/"  path_segments 
           opaque_part   = uric_no_slash *uric
         

        Returns:
        the escaped path string
      • getRawName

        public char[] getRawName()
        Get the raw-escaped basename of the path.
        Returns:
        the raw-escaped basename
      • getEscapedName

        public java.lang.String getEscapedName()
        Get the escaped basename of the path.
        Returns:
        the escaped basename string
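
        As a minimal sketch of what "basename" means here, the hypothetical helper below returns everything after the last '/' of an escaped path, leaving escapes untouched. How this class treats a path that ends with '/' is not stated in this document; the sketch returns an empty string in that case.

```java
/** Hypothetical helper for illustration; not the implementation of getEscapedName. */
final class BasenameSketch {

    /** Returns the part of the escaped path after the last '/'. */
    static String basename(String escapedPath) {
        if (escapedPath == null) {
            return null;
        }
        int lastSlash = escapedPath.lastIndexOf('/');
        return escapedPath.substring(lastSlash + 1);
    }

    public static void main(String[] args) {
        System.out.println(basename("/rfc/rfc1808.txt"));                      // rfc1808.txt
        System.out.println(basename("/00/Weather/California/Los%20Angeles")); // Los%20Angeles
    }
}
```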
      • getRawPathQuery

        public char[] getRawPathQuery()
        Get the raw-escaped path and query.
        Returns:
        the raw-escaped path and query
      • getEscapedPathQuery

        public java.lang.String getEscapedPathQuery()
        Get the escaped path and query.
        Returns:
        the escaped path and query string
      • setRawQuery

        public void setRawQuery​(char[] escapedQuery)
                         throws URIException
        Set the raw-escaped query.
        Parameters:
        escapedQuery - the raw-escaped query
        Throws:
        URIException - escaped query not valid
      • setEscapedQuery

        public void setEscapedQuery​(java.lang.String escapedQuery)
                             throws URIException
        Set the escaped query string.
        Parameters:
        escapedQuery - the escaped query string
        Throws:
        URIException - escaped query not valid
      • setQuery

        public void setQuery​(java.lang.String query)
                      throws URIException
        Set the query.

        When the reserved special characters ("&", "=", "+", ",", and "$") within a query component cannot be misinterpreted, it is recommended to encode the whole query with this method.

        Additional APIs for the special uses of reserved characters in each protocol are implemented in the protocol classes that inherit from URI, so refer to the same-named APIs in each protocol-specific class.

        Parameters:
        query - the query string.
        Throws:
        URIException - incomplete trailing escape pattern or unsupported character encoding
        See Also:
        encode(java.lang.String, java.util.BitSet, java.lang.String)
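
        The effect of encoding a whole query against a set of allowed characters can be sketched as follows. The class QueryEncodeSketch and its allowedInQuery set are hypothetical and are not the encode method referenced above. The set keeps ASCII letters, digits, the RFC 2396 "mark" characters, and the five reserved characters named above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical helper for illustration; not the encode method of this class. */
final class QueryEncodeSketch {

    /** Builds the assumed set of octets that pass through unescaped. */
    static BitSet allowedInQuery() {
        BitSet allowed = new BitSet(128);
        allowed.set('a', 'z' + 1);
        allowed.set('A', 'Z' + 1);
        allowed.set('0', '9' + 1);
        for (char c : "-_.!~*'()".toCharArray()) {
            allowed.set(c);                       // RFC 2396 "mark" characters
        }
        for (char c : "&=+,$".toCharArray()) {
            allowed.set(c);                       // reserved characters kept as delimiters
        }
        return allowed;
    }

    /** Percent-escapes every octet of the string that is not in the allowed set. */
    static String encode(String unescaped, BitSet allowed, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : unescaped.getBytes(charset)) {
            int octet = b & 0xFF;
            if (allowed.get(octet)) {
                out.append((char) octet);
            } else {
                out.append('%').append(String.format("%02X", octet));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = encode("q=a b&lang=ko", allowedInQuery(), StandardCharsets.UTF_8);
        System.out.println(escaped); // q=a%20b&lang=ko
    }
}
```

        Because "&" and "=" pass through unchanged, a literal "&" inside a value cannot be told apart from a delimiter after encoding. That is the case in which the whole-query approach should not be used.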
      • getRawQuery

        public char[] getRawQuery()
        Get the raw-escaped query.
        Returns:
        the raw-escaped query
      • getEscapedQuery

        public java.lang.String getEscapedQuery()
        Get the escaped query.
        Returns:
        the escaped query string
      • setRawFragment

        public void setRawFragment​(char[] escapedFragment)
                            throws URIException
        Set the raw-escaped fragment.
        Parameters:
        escapedFragment - the raw-escaped fragment
        Throws:
        URIException - escaped fragment not valid
      • setEscapedFragment

        public void setEscapedFragment​(java.lang.String escapedFragment)
                                throws URIException
        Set the escaped fragment string.
        Parameters:
        escapedFragment - the escaped fragment string
        Throws:
        URIException - escaped fragment not valid
      • setFragment

        public void setFragment​(java.lang.String fragment)
                         throws URIException
        Set the fragment.
        Parameters:
        fragment - the fragment string.
        Throws:
        URIException - If an error occurs.
      • getRawFragment

        public char[] getRawFragment()
        Get the raw-escaped fragment.

        The optional fragment identifier is not part of a URI, but is often used in conjunction with a URI.

        The format and interpretation of fragment identifiers is dependent on the media type [RFC2046] of the retrieval result.

        A fragment identifier is only meaningful when a URI reference is intended for retrieval and the result of that retrieval is a document for which the identified fragment is consistently defined.

        Returns:
        the raw-escaped fragment
      • getEscapedFragment

        public java.lang.String getEscapedFragment()
        Get the escaped fragment.
        Returns:
        the escaped fragment string
      • removeFragmentIdentifier

        protected char[] removeFragmentIdentifier​(char[] component)
        Remove the fragment identifier of the given component.
        Parameters:
        component - the component that may include a fragment
        Returns:
        the component with the fragment identifier removed
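
        A minimal sketch of this kind of operation, assuming the fragment identifier starts at the first '#' in the component. The class FragmentSketch is hypothetical and is not the implementation of this method.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not the implementation of removeFragmentIdentifier. */
final class FragmentSketch {

    /** Returns the component truncated before the first '#', if there is one. */
    static char[] removeFragmentIdentifier(char[] component) {
        if (component == null) {
            return null;
        }
        for (int i = 0; i < component.length; i++) {
            if (component[i] == '#') {
                return Arrays.copyOf(component, i); // keep only the part before '#'
            }
        }
        return component;                           // no fragment present
    }

    public static void main(String[] args) {
        char[] stripped = removeFragmentIdentifier("/doc.html#sec2".toCharArray());
        System.out.println(new String(stripped)); // /doc.html
    }
}
```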
      • normalize

        protected char[] normalize​(char[] path)
                            throws URIException
        Normalize the given hier path part.

        Algorithm taken from URI reference parser at http://www.apache.org/~fielding/uri/rev-2002/issues.html.

        Parameters:
        path - the path to normalize
        Returns:
        the normalized path
        Throws:
        URIException - no more higher path level to be normalized
      • normalize

        public void normalize()
                       throws URIException
        Normalizes the path part of this URI. Normalization is only meant to be performed on URIs with an absolute path. Calling this method on a relative path URI will have no effect.
        Throws:
        URIException - no more higher path level to be normalized
        See Also:
        isAbsPath()
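
        Path normalization of this kind can be sketched as dot-segment removal. The code below is a simplified illustration and not the algorithm referenced above: it leaves relative paths unchanged, drops "." segments, lets ".." remove the preceding segment, and throws when ".." would climb above the root. IllegalArgumentException stands in for URIException so that the sketch is self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper for illustration; not the implementation of normalize. */
final class NormalizeSketch {

    /** Removes "." and ".." segments from an absolute path. */
    static String normalize(String path) {
        if (!path.startsWith("/")) {
            return path;                           // relative path: no effect
        }
        Deque<String> out = new ArrayDeque<>();
        String[] segments = path.substring(1).split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            String segment = segments[i];
            boolean last = i == segments.length - 1;
            if (segment.equals(".")) {
                if (last) {
                    out.addLast("");               // keep the trailing slash
                }
            } else if (segment.equals("..")) {
                if (out.isEmpty()) {
                    throw new IllegalArgumentException("no more higher path level");
                }
                out.removeLast();
                if (last) {
                    out.addLast("");               // keep the trailing slash
                }
            } else {
                out.addLast(segment);
            }
        }
        return "/" + String.join("/", out);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/a/b/../c/./d")); // /a/c/d
        System.out.println(normalize("/a/b/.."));       // /a/
    }
}
```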
      • equals

        protected boolean equals​(char[] first,
                                 char[] second)
        Test if the first array is equal to the second array.
        Parameters:
        first - the first character array
        second - the second character array
        Returns:
        true if they're equal
      • equals

        public boolean equals​(java.lang.Object obj)
        Test whether this URI is equal to another object.
        Overrides:
        equals in class java.lang.Object
        Parameters:
        obj - an object to compare
        Returns:
        true if two URI objects are equal
      • hashCode

        public int hashCode()
        Return a hash code for this URI.
        Overrides:
        hashCode in class java.lang.Object
        Returns:
        a hash code value for this URI
      • compareTo

        public int compareTo​(java.lang.Object obj)
                      throws java.lang.ClassCastException
        Compare this URI to another object.
        Specified by:
        compareTo in interface java.lang.Comparable
        Parameters:
        obj - the object to be compared.
        Returns:
        0 if the two are the same, -1 otherwise; the authority component is compared first
        Throws:
        java.lang.ClassCastException - not URI argument
      • clone

        public java.lang.Object clone()
                               throws java.lang.CloneNotSupportedException
        Create and return a copy of this object, the URI-reference containing the userinfo component. Note that the whole URI-reference including the userinfo component cannot be obtained as a String.

        Use this method to copy an identical URI object, including the userinfo component.

        Overrides:
        clone in class java.lang.Object
        Returns:
        a clone of this instance
        Throws:
        java.lang.CloneNotSupportedException
      • getRawURI

        public char[] getRawURI()
        Get the URI character sequence in raw-escaped form. This is useful when handing the URI to the protocol that transports it.

        It is clearly unwise to use a URL that contains a password which is intended to be secret. In particular, the use of a password within the 'userinfo' component of a URL is strongly disrecommended except in those rare cases where the 'password' parameter is intended to be public.

        To get the individual parts of the userinfo, use the methods of the scheme-specific URL class; what is available depends on that class.

        Returns:
        the URI character sequence
      • getEscapedURI

        public java.lang.String getEscapedURI()
        Get the URI character sequence in escaped form. This is useful when handing the URI to the protocol that transports it.
        Returns:
        the escaped URI string
      • getRawURIReference

        public char[] getRawURIReference()
        Get the URI reference character sequence.
        Returns:
        the URI reference character sequence
      • getEscapedURIReference

        public java.lang.String getEscapedURIReference()
        Get the escaped URI reference string.
        Returns:
        the escaped URI reference string
      • toString

        public java.lang.String toString()
        Get the escaped URI string.

        For security reasons, documents use the URI-reference form only without the userinfo component, as in http://jakarta.apache.org/. A URI-reference with the userinfo component can still be parsed.

        In other words, this URI and any of its subclasses must not expose the URI-reference expression with the userinfo component, such as http://user:password@hostport/restricted_zone.
        This means that the API client programmer should extract the user and password separately. Subclasses will probably support this, but not as a whole URI-reference expression.

        Overrides:
        toString in class java.lang.Object
        Returns:
        the escaped URI string
        See Also:
        clone()
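
        The rule above (never expose the userinfo in a printed URI) can be sketched with plain string handling. The helper withoutUserinfo is hypothetical and is not how toString is implemented. It assumes an escaped, absolute URI of the form scheme://authority followed by an optional path, query, or fragment.

```java
/** Hypothetical helper for illustration; not the implementation of toString. */
final class UserinfoSketch {

    /** Returns the escaped URI with any "userinfo@" prefix removed from the authority. */
    static String withoutUserinfo(String escapedUri) {
        int marker = escapedUri.indexOf("://");
        if (marker < 0) {
            return escapedUri;                     // no authority component
        }
        int authorityStart = marker + 3;
        int authorityEnd = escapedUri.length();
        for (int i = authorityStart; i < escapedUri.length(); i++) {
            char c = escapedUri.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                authorityEnd = i;                  // authority ends at the first delimiter
                break;
            }
        }
        int at = escapedUri.lastIndexOf('@', authorityEnd - 1);
        if (at < authorityStart) {
            return escapedUri;                     // no userinfo present
        }
        return escapedUri.substring(0, authorityStart) + escapedUri.substring(at + 1);
    }

    public static void main(String[] args) {
        System.out.println(withoutUserinfo("http://user:password@hostport/restricted_zone"));
        // http://hostport/restricted_zone
    }
}
```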