Result for URL: http://pub.ks-and-ks.ne.jp/prog/libmoe/
     ____________________________________________________________________________

  Functions to handle multiple-octet character encoding schemes
     ____________________________________________________________________________

     [3]libmoe-1.5.8.tar.gz (1523KB, 2004-11-21 18:34:48)

   is a gzipped tarball of a collection of functions for handling sequences of
   characters consisting of multiple octets. It includes [4]a character encoding
   conversion tool, which was initially written for debugging this library. Despite
   that initial intention, I believe it is a very useful tool. You can view the
   ChangeLog:

     [5]libmoe-1.5.8-ChangeLog.txt (31KB, 2004-11-21 18:34:48)

   which is included in the above tarball.

   The development version:

     [6]libmoe-devel.tar.gz (1539KB, 2004-11-21 18:36:19)

   and its ChangeLog

     [7]libmoe-devel-ChangeLog.txt (32KB, 2004-11-21 18:36:19)

   are also available.

   The main functions of the library are to compute, from a character encoded in
   multiple octets, a non-negative integer, and to reproduce the original octet
   sequence from that integer. For convenience, this document calls the integer
   the Universal Code Point (UCP). It carries complete information about the coded
   character set containing the character and about the codepoint of the character
   in that set.
     ____________________________________________________________________________

  Requirements

   To build and install this library, you need a C compiler and libraries conforming
   to the ANSI standard. Further,
     * the "int" of your cc must be at least 32 bits wide,
     * your stdio library must have functions "fileno()" and "fdopen()",
     * if you are going to use the included Makefile, you need GNU Make, GNU
       binutils and GNU C compiler supporting shared objects.
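
   The first two requirements can be checked before building. The following is a
   minimal sketch, not part of libmoe: the function names are hypothetical, and it
   assumes a POSIX system where dup() and close() are available in addition to
   fileno() and fdopen().

```c
#define _POSIX_C_SOURCE 200809L  /* expose fileno(), fdopen(), dup() */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if "int" can hold every 32-bit signed value, else 0. */
int int_is_at_least_32_bits(void)
{
    return INT_MAX >= 2147483647L;
}

/* Returns 1 if fileno() and fdopen() work on a duplicate of the
 * descriptor behind a writable stream, else 0. The duplicate is
 * closed again, so the original stream is left untouched. */
int stdio_fd_functions_work(FILE *writable_stream)
{
    int fd, copy;
    FILE *reopened;

    assert(writable_stream != NULL);
    fd = fileno(writable_stream);
    if (fd < 0)
        return 0;
    copy = dup(fd);
    if (copy < 0)
        return 0;
    reopened = fdopen(copy, "w");
    if (reopened == NULL) {
        close(copy);
        return 0;
    }
    fclose(reopened);  /* also closes the duplicated descriptor */
    return 1;
}
```

   Calling int_is_at_least_32_bits() and stdio_fd_functions_work(stdout) from a
   small test program and checking that both return 1 is one way to confirm that
   the platform meets these two requirements.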

   I strongly recommend using the GNU C compiler and GNU Make.

   If you build with the included Makefile, you need to tell your dynamic linker
   the directory (/usr/local/lib) in which the shared library is installed.

   For example, if you are installing on a Linux box, add the line
/usr/local/lib

   to the file /etc/ld.so.conf unless it already contains such a line, and then issue
   the command
/sbin/ldconfig
     ____________________________________________________________________________

  Acceptable encodings

   This library can handle the following subset of the ISO 2022 escape sequences:
     * designating an ISO 2022 registered character set on an intermediate buffer,
     * designating UTF-8,
     * return from UTF-8,
     * locking shift,
     * 7bit single shift by 1/11 4/14 or 1/11 4/15,
     * 8bit single shift by 8/14 or 8/15.
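
   The shifts above are written in the column/row notation of ISO 2022, where
   "c/r" denotes the octet c * 16 + r. As a minimal sketch (the helper names are
   hypothetical and not part of libmoe), the notation and the single shifts listed
   above can be expressed as:

```c
#include <assert.h>

/* Converts ISO 2022 "column/row" notation to an octet value,
 * e.g. 1/11 is 0x1B (ESC). */
unsigned char octet_from_column_row(unsigned column, unsigned row)
{
    assert(column <= 15 && row <= 15);
    return (unsigned char)(column * 16u + row);
}

/* Returns 1 if the two octets form a 7-bit single shift,
 * i.e. 1/11 4/14 (ESC N) or 1/11 4/15 (ESC O). */
int is_7bit_single_shift(unsigned char first, unsigned char second)
{
    return first == octet_from_column_row(1, 11)
        && (second == octet_from_column_row(4, 14)
            || second == octet_from_column_row(4, 15));
}

/* Returns 1 if the octet is an 8-bit single shift,
 * i.e. 8/14 (0x8E) or 8/15 (0x8F). */
int is_8bit_single_shift(unsigned char octet)
{
    return octet == octet_from_column_row(8, 14)
        || octet == octet_from_column_row(8, 15);
}
```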

   Further it can handle the following non-ISO 2022 encodings:
     * UTF-8, UTF-16, UTF-16BE, UTF-16LE,
     * [8]X-MOE-INTERNAL,
     * Shift_JIS,
     * Big Five,
     * EUC-tw,
     * GBK, GB 18030-2000 (a.k.a. GBK2K),
     * Johab,
     * Unified Hangul,
     * KOI8-R,
     * KOI8-U,
     * Microsoft Windows Codepages 1250 -- 1258.

   Characters in these encodings can be embedded in ISO 2022 encoded character
   sequences by placing them between the leading escape sequence

     1/11 2/5 2/1 2/X 3/Y

   and the trailing escape sequence

     1/11 2/5 4/0

   where X * 0x10 + Y is the integer that the library assigns to the encoding.
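
   As an illustration of the byte layout, the following minimal sketch builds the
   two escape sequences from an encoding number. The function names and the
   example number 0x12 are hypothetical; the actual numbers are assigned by the
   library and are not listed here.

```c
#include <assert.h>
#include <stddef.h>

/* Writes 1/11 2/5 2/1 2/X 3/Y into out, where
 * encoding_id == X * 0x10 + Y. Returns the number of octets written. */
size_t build_leading_escape(unsigned encoding_id, unsigned char out[5])
{
    unsigned x = encoding_id >> 4;     /* high nibble goes into 2/X */
    unsigned y = encoding_id & 0x0Fu;  /* low nibble goes into 3/Y  */

    assert(encoding_id <= 0xFFu);
    out[0] = 0x1B;                          /* 1/11 (ESC) */
    out[1] = 0x25;                          /* 2/5        */
    out[2] = 0x21;                          /* 2/1        */
    out[3] = (unsigned char)(0x20u | x);    /* 2/X        */
    out[4] = (unsigned char)(0x30u | y);    /* 3/Y        */
    return 5;
}

/* Writes 1/11 2/5 4/0 into out. Returns the number of octets written. */
size_t build_trailing_escape(unsigned char out[3])
{
    out[0] = 0x1B;  /* 1/11 (ESC) */
    out[1] = 0x25;  /* 2/5        */
    out[2] = 0x40;  /* 4/0        */
    return 3;
}

/* Recovers X * 0x10 + Y from a leading escape sequence. */
unsigned encoding_id_from_escape(const unsigned char seq[5])
{
    return ((seq[3] & 0x0Fu) << 4) | (seq[4] & 0x0Fu);
}
```

   For the hypothetical number 0x12, the leading sequence is the five octets
   1B 25 21 21 32, and encoding_id_from_escape() maps them back to 0x12.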
     ____________________________________________________________________________

  Universal Code Point

   The library classifies coded character sets (CCS) into 6 categories:
     * Unicode,
     * 94 set in ISO 2022,
     * 96 set in ISO 2022,
     * 7 bit set not in ISO 2022,
     * 94x94 set, and
     * 15 bit set not in ISO 2022.
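
   Since the order of these categories matters for the UCP, one way to picture the
   classification is an enumeration whose values follow the list order. This is
   only a sketch: the identifiers are hypothetical and are not taken from the
   libmoe headers.

```c
#include <assert.h>
#include <stddef.h>

/* The six CCS categories, in the order in which they are listed. */
enum ccs_category {
    CCS_UNICODE,
    CCS_ISO2022_94,
    CCS_ISO2022_96,
    CCS_NON_ISO2022_7BIT,
    CCS_94X94,
    CCS_NON_ISO2022_15BIT,
    CCS_CATEGORY_COUNT  /* number of categories, not a category itself */
};

/* Returns a human-readable name, or NULL for an invalid value. */
const char *ccs_category_name(enum ccs_category category)
{
    static const char *const names[CCS_CATEGORY_COUNT] = {
        "Unicode",
        "94 set in ISO 2022",
        "96 set in ISO 2022",
        "7 bit set not in ISO 2022",
        "94x94 set",
        "15 bit set not in ISO 2022"
    };

    if ((unsigned)category >= CCS_CATEGORY_COUNT)
        return NULL;
    return names[category];
}

/* Returns 1 if category a comes before category b in the list. */
int ccs_category_precedes(enum ccs_category a, enum ccs_category b)
{
    assert((unsigned)a < CCS_CATEGORY_COUNT);
    assert((unsigned)b < CCS_CATEGORY_COUNT);
    return a < b;
}
```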

   Characters in Unicode are assigned the same codepoints as they have in Unicode.

   For a character in another CCS, it is somewhat difficult to describe in natural
   language how the UCP is determined. Roughly speaking, we arrange all the
   codepoints into one sequence, ordered by the categories above and by the final
   octet of the escape sequence designating the CCS. The UCP is the logical OR of
   the index (starting with 0) in the big sequence and of 1U