Util/Punycode.php
- Classes
- Kwf_Util_Punycode
\Kwf_Util_Punycode
- author
- Matthias Sommerfeld <mso@phlylabs.de>
- author
- Leonid Kogan <lko@neuse.de>
- copyright
- 2004-2010 phlyLabs Berlin, http://phlylabs.de
- version
- 0.6.9 2010-11-04
Properties
- $NP
- $_allow_overlong
- $_api_encoding
- $_base
- $_damp
- $_encode_german_sz
- $_error
- $_initial_bias
- $_initial_n
- $_invalid_ucs
- $_lbase
- $_lcount
- $_max_ucs
- $_ncount
- $_punycode_prefix
- $_sbase
- $_scount
- $_skew
- $_strict_mode
- $_tbase
- $_tcount
- $_tmax
- $_tmin
- $_vbase
- $_vcount
Methods
- __construct
- _adapt
- _apply_cannonical_ordering
- _combine
- _decode
- _decode_digit
- _encode
- _encode_digit
- _error
- _get_combining_class
- _hangul_compose
- _hangul_decompose
- _nameprep
- _ucs4_string_to_ucs4
- _ucs4_to_ucs4_string
- _ucs4_to_utf8
- _utf8_to_ucs4
- decode
- encode
- encode_uri
- get_last_error
- set_parameter
Description
Encode/decode Internationalized Domain Names.
The class converts internationalized domain names (see RFC 3490 for details), as used by various registries worldwide, between their original (localized) form and the encoded form used in the DNS (Domain Name System).
The class provides two main public methods, encode() and decode(), which do exactly what you would expect them to do. You may pass complete domain names, simple strings and complete email addresses as well. That means you may use any of the following notations:
- www.nörgler.com
- xn--nrgler-wxa
- xn--brse-5qa.xn--knrz-1ra.info
Unicode input may be given as a UTF-8 string, a UCS-4 string or a UCS-4 array. Unicode output is available in the same formats. You can select your preferred format via set_parameter().
ACE input and output are always expected to be ASCII.
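The following is a minimal usage sketch of the round trip described above, not a definitive recipe. It assumes the class can be autoloaded as Kwf_Util_Punycode and that encode() returns false on failure, which this page does not state; the helper name punycode_round_trip_sketch is hypothetical.

```php
<?php
/**
 * Hypothetical helper: encodes a domain to its ACE form and decodes it back.
 * Returns null when Kwf_Util_Punycode cannot be loaded in this environment.
 */
function punycode_round_trip_sketch(string $domain): ?array
{
    if (!class_exists('Kwf_Util_Punycode')) {
        return null;
    }
    $idn = new Kwf_Util_Punycode();
    $ace = $idn->encode($domain);
    if ($ace === false) { // assumption: false signals a failure
        return array('error' => $idn->get_last_error());
    }
    return array('ace' => $ace, 'unicode' => $idn->decode($ace));
}

// "\xC3\xB6" is the UTF-8 byte sequence for "ö".
var_dump(punycode_round_trip_sketch("www.n\xC3\xB6rgler.com"));
```

Writing the non-ASCII character as a byte escape keeps the example independent of the encoding of the source file.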
Properties
$NP
 $NP = 'array'
 Holds all relevant mapping tables. See RFC 3454 for details.
 Details
- visibility
- protected
- default
- array
- final
- false
- static
- false
- private
- array
- since
- 0.5.2
$_allow_overlong
 $_allow_overlong = 'false'
 Details
- visibility
- protected
- default
- false
- final
- false
- static
- false
$_api_encoding
 $_api_encoding = 'utf8'
 Details
- visibility
- protected
- default
- utf8
- final
- false
- static
- false
$_base
 $_base = '36'
 Details
- visibility
- protected
- default
- 36
- final
- false
- static
- false
$_damp
 $_damp = '700'
 Details
- visibility
- protected
- default
- 700
- final
- false
- static
- false
$_encode_german_sz
 $_encode_german_sz = 'true'
 Details
- visibility
- protected
- default
- true
- final
- false
- static
- false
$_error
 $_error = 'false'
 Details
- visibility
- protected
- default
- false
- final
- false
- static
- false
$_initial_bias
 $_initial_bias = '72'
 Details
- visibility
- protected
- default
- 72
- final
- false
- static
- false
$_initial_n
 $_initial_n = '0x80'
 Details
- visibility
- protected
- default
- 0x80
- final
- false
- static
- false
$_invalid_ucs
 $_invalid_ucs = '0x80000000'
 Details
- visibility
- protected
- default
- 0x80000000
- final
- false
- static
- false
$_lbase
 $_lbase = '0x1100'
 Details
- visibility
- protected
- default
- 0x1100
- final
- false
- static
- false
$_lcount
 $_lcount = '19'
 Details
- visibility
- protected
- default
- 19
- final
- false
- static
- false
$_max_ucs
 $_max_ucs = '0x10FFFF'
 Details
- visibility
- protected
- default
- 0x10FFFF
- final
- false
- static
- false
$_ncount
 $_ncount = '588'
 Details
- visibility
- protected
- default
- 588
- final
- false
- static
- false
$_punycode_prefix
 $_punycode_prefix = 'xn--'
 Details
- visibility
- protected
- default
- xn--
- final
- false
- static
- false
$_sbase
 $_sbase = '0xAC00'
 Details
- visibility
- protected
- default
- 0xAC00
- final
- false
- static
- false
$_scount
 $_scount = '11172'
 Details
- visibility
- protected
- default
- 11172
- final
- false
- static
- false
$_skew
 $_skew = '38'
 Details
- visibility
- protected
- default
- 38
- final
- false
- static
- false
$_strict_mode
 $_strict_mode = 'false'
 Details
- visibility
- protected
- default
- false
- final
- false
- static
- false
$_tbase
 $_tbase = '0x11A7'
 Details
- visibility
- protected
- default
- 0x11A7
- final
- false
- static
- false
$_tcount
 $_tcount = '28'
 Details
- visibility
- protected
- default
- 28
- final
- false
- static
- false
$_tmax
 $_tmax = '26'
 Details
- visibility
- protected
- default
- 26
- final
- false
- static
- false
$_tmin
 $_tmin = '1'
 Details
- visibility
- protected
- default
- 1
- final
- false
- static
- false
$_vbase
 $_vbase = '0x1161'
 Details
- visibility
- protected
- default
- 0x1161
- final
- false
- static
- false
$_vcount
 $_vcount = '21'
 Details
- visibility
- protected
- default
- 21
- final
- false
- static
- false
Methods
__construct
__construct(array $options = false) : boolean
The constructor.
Arguments
- $options
- array
 
Output
- boolean
 Details
- visibility
- public
- final
- false
- static
- false
- since
- 0.5.2
_adapt
_adapt(int $delta, int $npoints, int $is_first) : int
Adapt the bias according to the current code point and position.
Arguments
- $delta
- int
 
- $npoints
- int
 
- $is_first
- int
 
Output
- int
 Details
- visibility
- protected
- final
- false
- static
- false
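As an illustration of what the bias adaptation does, here is a sketch of the procedure from RFC 3492, section 6.1, using the default values documented for $_base, $_tmin, $_tmax, $_skew and $_damp. It is not the class's actual code; the function name is hypothetical and the third parameter is typed as a boolean here, whereas this page documents it as an int.

```php
<?php
/**
 * Bias adaptation following RFC 3492, section 6.1 (illustrative sketch).
 */
function punycode_adapt_sketch(int $delta, int $npoints, bool $isFirst): int
{
    $base = 36; $tmin = 1; $tmax = 26; $skew = 38; $damp = 700;

    // Scale delta down: strongly on the first adaptation, halved afterwards.
    $delta = $isFirst ? intdiv($delta, $damp) : intdiv($delta, 2);
    // Compensate for the longer string the next delta will be inserted into.
    $delta += intdiv($delta, $npoints);

    // Estimate how many minimum-threshold digits the next delta will need.
    $k = 0;
    while ($delta > intdiv(($base - $tmin) * $tmax, 2)) {
        $delta = intdiv($delta, $base - $tmin);
        $k += $base;
    }
    return $k + intdiv(($base - $tmin + 1) * $delta, $delta + $skew);
}
```

The damping factor of 700 is applied only once, because the first delta is usually much larger than the following ones.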
_apply_cannonical_ordering
_apply_cannonical_ordering(array $input) : array
Applies the canonical ordering of a decomposed UCS4 sequence.
Arguments
- $input
- array
 Decomposed UCS4 sequence
Output
- array
- Ordered UCS4 sequence
 Details
- visibility
- protected
- final
- false
- static
- false
_combine
_combine(array $input) : array
Composes a sequence of starters and non-starters.
Arguments
- $input
- array
 UCS4 Decomposed sequence
Output
- array
- Ordered UCS4 sequence
 Details
- visibility
- protected
- final
- false
- static
- false
_decode
_decode($encoded) : mixed
The actual decoding algorithm.
Arguments
- $encoded
- string
Output
- mixed
 Details
- visibility
- protected
- final
- false
- static
- false
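To make the decoding step more concrete, the sketch below follows the decoding procedure of RFC 3492, section 6.2, for a single label with the "xn--" prefix already removed. It is an independent illustration rather than the body of _decode(): the function name is hypothetical, the result is an array of code points, and overflow checks are left out for brevity.

```php
<?php
/**
 * Decodes one Punycode label (without "xn--") into an array of code points.
 * Returns null on malformed input. Illustrative sketch only.
 */
function punycode_decode_label_sketch(string $encoded): ?array
{
    $base = 36; $tmin = 1; $tmax = 26; $skew = 38; $damp = 700;
    $n = 0x80; $bias = 72; $i = 0;

    // Everything before the last '-' is copied literally as basic code points.
    $pos = strrpos($encoded, '-');
    $hasBasic = $pos !== false && $pos > 0;
    $output = $hasBasic ? array_map('ord', str_split(substr($encoded, 0, $pos))) : array();
    $rest = $pos === false ? $encoded : (string) substr($encoded, $pos + 1);

    $len = strlen($rest);
    for ($p = 0; $p < $len; ) {
        $oldi = $i;
        $w = 1;
        // Read one variable-length integer and add it to $i.
        for ($k = $base; ; $k += $base) {
            if ($p >= $len) {
                return null; // input ended in the middle of a number
            }
            $c = ord($rest[$p++]);
            if ($c >= 0x30 && $c <= 0x39) {
                $digit = $c - 0x30 + 26;   // '0'..'9'
            } elseif ($c >= 0x61 && $c <= 0x7A) {
                $digit = $c - 0x61;        // 'a'..'z'
            } elseif ($c >= 0x41 && $c <= 0x5A) {
                $digit = $c - 0x41;        // 'A'..'Z'
            } else {
                return null;
            }
            $i += $digit * $w;
            $t = $k <= $bias ? $tmin : ($k >= $bias + $tmax ? $tmax : $k - $bias);
            if ($digit < $t) {
                break;
            }
            $w *= $base - $t;
        }

        // Adapt the bias (RFC 3492, section 6.1).
        $count = count($output) + 1;
        $delta = $oldi === 0 ? intdiv($i - $oldi, $damp) : intdiv($i - $oldi, 2);
        $delta += intdiv($delta, $count);
        for ($k = 0; $delta > intdiv(($base - $tmin) * $tmax, 2); $k += $base) {
            $delta = intdiv($delta, $base - $tmin);
        }
        $bias = $k + intdiv(($base - $tmin + 1) * $delta, $delta + $skew);

        // $i carries both the code point increment and the insert position.
        $n += intdiv($i, $count);
        $i %= $count;
        array_splice($output, $i, 0, array($n));
        $i++;
    }
    return $output;
}
```

For the label "nrgler-wxa" from the description, the three digits w, x and a decode to the value 827, which splits into a code point increment of 118 (giving U+00F6) and insert position 1.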
_decode_digit
_decode_digit(int $cp) : int
Decode a certain digit.
Arguments
- $cp
- int
 
Output
- int
 Details
- visibility
- protected
- final
- false
- static
- false
_encode
_encode($decoded) : mixed
The actual encoding algorithm.
Arguments
- $decoded
- string
Output
- mixed
 Details
- visibility
- protected
- final
- false
- static
- false
_encode_digit
_encode_digit(int $d) : string
Encode a certain digit.
Arguments
- $d
- int
 
Output
- string
 Details
- visibility
- protected
- final
- false
- static
- false
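The digit mapping used by Punycode (RFC 3492, section 5) can be sketched as a pair of small functions. These are illustrations with hypothetical names, not the class's own methods; in particular, the decoding sketch takes a one-character string, whereas _decode_digit() is documented as taking an int.

```php
<?php
// Digit values 0..25 map to 'a'..'z', values 26..35 map to '0'..'9'.
function punycode_encode_digit_sketch(int $d): string
{
    return chr($d < 26 ? ord('a') + $d : ord('0') + $d - 26);
}

// Inverse mapping; upper-case letters are accepted, -1 marks a non-digit.
function punycode_decode_digit_sketch(string $char): int
{
    $cp = ord($char);
    if ($cp >= ord('0') && $cp <= ord('9')) {
        return $cp - ord('0') + 26;
    }
    if ($cp >= ord('a') && $cp <= ord('z')) {
        return $cp - ord('a');
    }
    if ($cp >= ord('A') && $cp <= ord('Z')) {
        return $cp - ord('A');
    }
    return -1;
}
```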
_error
_error(string $error)
Internal error handling method.
Arguments
- $error
- string
 
 Details
- visibility
- protected
- final
- false
- static
- false
_get_combining_class
_get_combining_class(integer $char) : integer
Returns the combining class of a certain wide char.
Arguments
- $char
- integer
 Wide char to check (32bit integer)
Output
- integer
- Combining class if found, else 0
 Details
- visibility
- protected
- final
- false
- static
- false
_hangul_compose
_hangul_compose(array $input) : array
Composes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul).
Arguments
- $input
- array
 Decomposed UCS4 sequence
Output
- array
- UCS4 sequence with syllables composed
 Details
- visibility
- protected
- final
- false
- static
- false
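The arithmetic behind Hangul composition can be sketched for a single group of jamo, using the values documented for $_sbase, $_lbase, $_vbase, $_tbase, $_lcount, $_vcount and $_tcount. Unlike _hangul_compose(), which works on a whole sequence, this hypothetical helper only combines one leading consonant, one vowel and an optional trailing consonant.

```php
<?php
/**
 * Composes one leading consonant, one vowel and an optional trailing
 * consonant into a precomposed syllable. Returns null if a jamo is out of
 * range. Illustrative sketch only.
 */
function hangul_compose_jamo_sketch(int $l, int $v, int $t = 0x11A7): ?int
{
    $sbase = 0xAC00; $lbase = 0x1100; $vbase = 0x1161; $tbase = 0x11A7;
    $lcount = 19; $vcount = 21; $tcount = 28;

    $lindex = $l - $lbase;
    $vindex = $v - $vbase;
    $tindex = $t - $tbase; // 0 means "no trailing consonant"
    if ($lindex < 0 || $lindex >= $lcount
        || $vindex < 0 || $vindex >= $vcount
        || $tindex < 0 || $tindex >= $tcount) {
        return null;
    }
    return $sbase + ($lindex * $vcount + $vindex) * $tcount + $tindex;
}
```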
_hangul_decompose
_hangul_decompose(integer $char) : array
Decomposes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul).
Arguments
- $char
- integer
 32bit UCS4 code point
Output
- array
- Either Hangul Syllable decomposed or original 32bit value as one value array
 Details
- visibility
- protected
- final
- false
- static
- false
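Hangul decomposition is plain index arithmetic over the values documented for $_sbase, $_lbase, $_vbase, $_tbase, $_tcount, $_ncount and $_scount. The sketch below shows that arithmetic under a hypothetical function name; it is not taken from the class itself.

```php
<?php
/**
 * Splits a precomposed Hangul syllable into its jamo. Any other code point
 * is returned unchanged as a one-element array. Illustrative sketch only.
 */
function hangul_decompose_sketch(int $char): array
{
    $sbase = 0xAC00; $lbase = 0x1100; $vbase = 0x1161; $tbase = 0x11A7;
    $tcount = 28; $ncount = 588; $scount = 11172;

    $sindex = $char - $sbase;
    if ($sindex < 0 || $sindex >= $scount) {
        return array($char); // not a precomposed Hangul syllable
    }
    $result = array(
        $lbase + intdiv($sindex, $ncount),           // leading consonant
        $vbase + intdiv($sindex % $ncount, $tcount), // vowel
    );
    $t = $tbase + $sindex % $tcount;                 // trailing consonant
    if ($t !== $tbase) {
        $result[] = $t;                              // only present if non-zero
    }
    return $result;
}
```

For U+D55C the syllable index is 10588, which yields the leading consonant U+1112, the vowel U+1161 and the trailing consonant U+11AB.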
_nameprep
_nameprep(array $input) : string
Do Nameprep according to RFC 3491 and RFC 3454.
Arguments
- $input
- array
 Unicode Characters
Output
- string
- Unicode Characters, Nameprep'd
 Details
- visibility
- protected
- final
- false
- static
- false
_ucs4_string_to_ucs4
_ucs4_string_to_ucs4(string $input) : array
Convert UCS-4 string into UCS-4 array.
Arguments
- $input
- string
 
Output
- array
 Details
- visibility
- protected
- final
- false
- static
- false
_ucs4_to_ucs4_string
_ucs4_to_ucs4_string(array $input) : string
Convert UCS-4 array into UCS-4 string.
Arguments
- $input
- array
 
Output
- string
 Details
- visibility
- protected
- final
- false
- static
- false
_ucs4_to_utf8
_ucs4_to_utf8(string $input) : string
Convert UCS-4 string into UTF-8 string. See _utf8_to_ucs4() for details.
Arguments
- $input
- string
 
Output
- string
 Details
- visibility
- protected
- final
- false
- static
- false
_utf8_to_ucs4
_utf8_to_ucs4(string $input) : string
Converts a UTF-8 encoded string to its UCS-4 representation. By UCS-4 "strings" we mean arrays of 32-bit integers, each representing one of the "chars". This is because PHP cannot handle strings with a bit depth other than 8. This applies to the reverse method _ucs4_to_utf8(), too.
The following UTF-8 encodings are supported:
bytes  bits  representation
1      7     0xxxxxxx
2      11    110xxxxx 10xxxxxx
3      16    1110xxxx 10xxxxxx 10xxxxxx
4      21    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
5      26    111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
6      31    1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
Each x represents a bit that can be used to store character data. The five- and six-byte sequences are part of Annex D of ISO/IEC 10646-1:2000.
Arguments
- $input
- string
 
Output
- string
 Details
- visibility
- protected
- final
- false
- static
- false
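The byte patterns listed for this method can be turned into a small decoder. The sketch below is an independent illustration with a hypothetical name: it covers only the one- to four-byte forms, returns an array of code points, and does not reject overlong encodings.

```php
<?php
/**
 * Converts a UTF-8 string into an array of code points.
 * Returns null on malformed input. Illustrative sketch only.
 */
function utf8_to_code_points_sketch(string $input): ?array
{
    $out = array();
    $len = strlen($input);
    for ($i = 0; $i < $len; ) {
        $byte = ord($input[$i++]);
        if ($byte < 0x80) {                  // 0xxxxxxx
            $cp = $byte;
            $more = 0;
        } elseif (($byte & 0xE0) === 0xC0) { // 110xxxxx
            $cp = $byte & 0x1F;
            $more = 1;
        } elseif (($byte & 0xF0) === 0xE0) { // 1110xxxx
            $cp = $byte & 0x0F;
            $more = 2;
        } elseif (($byte & 0xF8) === 0xF0) { // 11110xxx
            $cp = $byte & 0x07;
            $more = 3;
        } else {
            return null; // stray continuation byte or unsupported lead byte
        }
        while ($more-- > 0) {
            if ($i >= $len || (ord($input[$i]) & 0xC0) !== 0x80) {
                return null; // missing or invalid continuation byte
            }
            // Each continuation byte contributes its low six bits.
            $cp = ($cp << 6) | (ord($input[$i++]) & 0x3F);
        }
        $out[] = $cp;
    }
    return $out;
}
```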
decode
decode(string $input, $one_time_encoding = false) : string
Decode a given ACE domain name.
Arguments
- $input
- string
 Domain name (ACE string)
- $one_time_encoding
- string
 Optional. Desired output encoding, see set_parameter()
Output
- string
- Decoded Domain name (UTF-8 or UCS-4)
 Details
- visibility
- public
- final
- false
- static
- false
encode
encode(string $decoded, $one_time_encoding = false) : string
Encode a given UTF-8 domain name.
Arguments
- $decoded
- string
 Domain name (UTF-8 or UCS-4)
- $one_time_encoding
- string
 Optional. Desired input encoding, see set_parameter()
Output
- string
- Encoded Domain name (ACE string)
 Details
- visibility
- public
- final
- false
- static
- false
encode_uri
encode_uri(string $uri) : string
Works around a weakness of encode(), which cannot properly handle URIs and encodes their path and query components as well.
Arguments
- $uri
- string
 Expects the URI as a UTF-8 (or ASCII) string
Output
- string
- The URI encoded to Punycode, everything but the host component is left alone
 Details
- visibility
- public
- final
- false
- static
- false
- since
- 0.6.4
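The idea of encoding only the host component can be sketched as follows. This is not how encode_uri() is necessarily implemented; the function name and the $encodeHost callback are hypothetical, with the callback standing in for a call such as array($idn, 'encode'). The pattern handles only the simple scheme://[userinfo@]host[:port] form and ignores cases such as IPv6 literals.

```php
<?php
/**
 * Applies $encodeHost to the host part of a URI and leaves scheme, userinfo,
 * port, path, query and fragment untouched. Illustrative sketch only.
 */
function encode_uri_host_sketch(string $uri, callable $encodeHost): string
{
    // Group 1: scheme and optional userinfo, group 2: host, group 3: the rest.
    $pattern = '~^([a-z][a-z0-9+.\-]*://(?:[^/?#@]*@)?)([^/?#:]+)(.*)$~is';
    if (!preg_match($pattern, $uri, $m)) {
        return $uri; // not in the simple form handled by this sketch
    }
    return $m[1] . $encodeHost($m[2]) . $m[3];
}

// strtoupper is used here only to make the affected part visible.
echo encode_uri_host_sketch('http://user@example.com:8080/a?b=c#d', 'strtoupper'), "\n";
```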
get_last_error
get_last_error() : string
Use this method to get the last error that occurred.
Output
- string
- The last error that occurred
 Details
- visibility
- public
- final
- false
- static
- false
set_parameter
set_parameter(mixed $option, string $value = false) : boolean
Sets a new option value. Available options and values:
- encoding: Use either UTF-8, UCS4 as array or UCS4 as string as input ('utf8' for UTF-8, 'ucs4_string' and 'ucs4_array' respectively for UCS4). The output is always UTF-8.
- overlong: Unicode does not allow unnecessarily long encodings of chars. To allow them, set this parameter to true, else to false. Default is false.
- strict: true selects strict mode, which is good for registration purposes and causes errors on failures. false selects loose mode, which is ideal for "wildlife" applications: errors are silently ignored and the original input is returned instead.
Arguments
- $option
- mixed
 Parameter to set (string: single parameter; array of Parameter => Value pairs)
- $value
- string
 Value to use (if parameter 1 is a string)
Output
- boolean
- true on success, false otherwise
 Details
- visibility
- public
- final
- false
- static
- false
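Both call forms of set_parameter() can be sketched as below, using only the option names and values listed for this method. The helper name is hypothetical, and the sketch assumes the class can be autoloaded as Kwf_Util_Punycode; it returns null when that is not the case.

```php
<?php
/**
 * Hypothetical helper: returns a converter configured for strict mode, or
 * null when the class is unavailable or an option is rejected.
 */
function make_strict_converter_sketch()
{
    if (!class_exists('Kwf_Util_Punycode')) {
        return null;
    }
    $idn = new Kwf_Util_Punycode();

    // Single option: name and value as two arguments.
    if (!$idn->set_parameter('encoding', 'ucs4_array')) {
        return null;
    }
    // Several options at once: array of option => value pairs.
    if (!$idn->set_parameter(array('strict' => true, 'overlong' => false))) {
        return null;
    }
    return $idn;
}
```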