    (from the XML FAQ)

Software for Chinese XML

This page is under construction!

For more details and contact information, check www.xmlsoftware.com. Note that the following list is largely based on the information published on the vendors' websites, not on testing. If the information was not clear from a vendor's web page, we have left it out. (Vendors: why not put a Chinese Numberplate on your website? See the note.)

XML Parsers

Name | From | Numberplate | Comment
SP | jclark.com | [numberplate image] | Public. C++. UNIX/Win32. Full SGML parser.
Expat | jclark.com | [numberplate image] | Public. C. UNIX/Win32. Non-validating XML parser. (Used in Mozilla and Perl.)
XT | jclark.com | [numberplate image] | Public. Java.
Ælfred | Microstar | [numberplate image] | Public. Java.
XML parser for Java | IBM | [numberplate image] | Public. Java. Supports many different encodings, especially the EBCDIC family.
Project X | Sun | [numberplate image] | Public. Java library. Source code not available? Said to support 120 different encodings.
DXP | DataChannel | [numberplate image] | Java.
LT | Language Technology Group | [numberplate image] | Public noncom. Java.
. | . | . | .
Conforming XML processing software should support

XML-Aware Character Code Converters

Name | From | Numberplate | Comment
. | . | . | .

Non-XML-Aware Character Code Converters

Name | From | Comment
iconv | XPG/UNIX vendors (standard part) | Source code not freely available. UNIX. Standard utility; some international versions include East Asian character sets.
tcs | Bell Labs, Plan 9 | Public noncom? Source code freely available. Includes converters for Chinese sets. Part of some Linux distributions. (We have used this to make lossless transcoders: source code available on request.)
trans | . | Public noncom. C. Generates C transcoders for many 8-bit character codes, but not for Chinese.
utf7 | Ross Paterson and Guongjin | Public. C. Converts from many character sets to UTF-8 or UTF-7.
native2unicode | Sun | Public. Java. (I am not sure what happens to missing characters.)
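
The question raised above for native2unicode, what happens to characters missing from the target character set, applies to any transcoder. The following is a minimal Python sketch, not taken from any of the tools listed: the function names `transcode` and `is_lossless` and the sample strings are assumptions made for illustration. It relies only on Python's built-in codecs and shows three possible policies: fail loudly, substitute silently, or emit an XML numeric character reference so that nothing is lost.

```python
def transcode(data: bytes, source: str, target: str, errors: str = "strict") -> bytes:
    """Decode `data` from `source` and re-encode it as `target`.

    `errors` is passed to str.encode and controls what happens to
    characters that the target encoding cannot represent.
    """
    text = data.decode(source)  # decoding stays strict: malformed input is an error
    return text.encode(target, errors)


def is_lossless(data: bytes, source: str, target: str) -> bool:
    """Return True if `data` survives a round trip source -> target -> source."""
    try:
        there = transcode(data, source, target)
        back = transcode(there, target, source)
    except UnicodeError:
        return False
    return back == data


if __name__ == "__main__":
    big5_text = "中文".encode("big5")
    print(is_lossless(big5_text, "big5", "utf-8"))

    # U+6C49 is a simplified form that the Big5 codec cannot represent.
    mixed = "中\u6c49".encode("utf-8")
    print(is_lossless(mixed, "utf-8", "big5"))
    print(transcode(mixed, "utf-8", "big5", errors="replace"))
    print(transcode(mixed, "utf-8", "big5", errors="xmlcharrefreplace"))
```

The `xmlcharrefreplace` policy is only safe when the output is XML or HTML character data; for other kinds of text, the strict policy at least makes the loss visible.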

Useful Source Code

Name | From | Numberplate | Comment
ure | psf??? | . | C? String libraries which can handle UTF-8.

XML-Aware Text Processing Applications

Name | From | Numberplate | Comment
Perl | . | . | Public. Cryptic.
OmniMark | omnimark.com | . | Limited. 4GL. Many, many platforms.
Balise | . | . | .
LT XML Utilities | Language Technology Group | [numberplate image] | Public noncom. Many utilities, such as XML versions of grep and sort.

Non-XML-Aware Text Processing Applications

Name | From | Numberplate | Comment
agrep | . | . | Public? The authors claim that shortest-string pattern matching is more suitable for processing text with structured markup: longest-string matching means that /<x>.*<\/x>/ will match all of "<x>zz</x>zz<x>zz</x>".
sort | GNU, or standard UNIX utility | . | NT version may be OK.
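
The agrep authors' point about longest-string matching can be reproduced with any regular-expression engine whose `.*` is greedy. The small Python sketch below uses the standard `re` module rather than agrep itself, and its variable names are only illustrative. It contrasts the greedy pattern from the table with its non-greedy counterpart.

```python
import re

text = "<x>zz</x>zz<x>zz</x>"

# Greedy (longest-string): .* runs on to the last </x> in the input.
greedy = re.findall(r"<x>.*</x>", text)

# Non-greedy (shortest-string): .*? stops at the first </x> it can reach.
shortest = re.findall(r"<x>.*?</x>", text)

print(greedy)    # ['<x>zz</x>zz<x>zz</x>']
print(shortest)  # ['<x>zz</x>', '<x>zz</x>']
```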

XML-Aware Applications

Name | From | Numberplate | Comment
FrameMaker+SGML 5.5.6 | Adobe | [numberplate image] | You can change the XML character set in which files will be saved in the File>Utilities>Application menu defaults. Supports Chinese input and typesetting. Remember to change the stylesheet to select fonts which have glyphs for Chinese; otherwise you will get strange sequences. I don't think selecting the character set is as flexible as it is in the Microsoft products; maybe Adobe is treating it as something to be set up at installation rather than as a user option.
Adept Editor | ArborText | [numberplate image] | ? Supports Chinese input and typesetting.
XML media server | ICO | [numberplate image] | ? Provides an XML-based web server on top of IBM's media server.
DynaText | Inso | [numberplate image] | ? The DynaText browser supports many different character sets. The new DynaText Technicians edition uses technology from General Dynamics TechSight (see next).
TechSight | General Dynamics (GDDS) | [numberplate image] | ? Soon to be superseded by DynaText/TE (see above). This seems to accept Chinese characters in Big5, but I think this version just accepts strings and does not understand the character set. (There may be an issue with headings: TechSight seems to uppercase all headings, which corrupts the display of Big5 characters that have [a-z] as their second byte; this may just be a stylesheet problem.)
Internet Explorer 5 | Microsoft | [numberplate image] | See the notes below.

The English-language version of Internet Explorer 5 (beta 5.00.0910.1309), running on English Windows 95, was able to display the XML test files correctly. The Chinese-language version of IE5 (same version number), running on Traditional Chinese Windows 98, could not display the same XML files. We are looking into this. Chinese users should be careful: after a bad experience here, we advise against installing the English IE5 beta on Chinese Windows, as it can cause trouble. (Also, the IE5 beta seemed to choke on the "standalone" attribute in the XML encoding PI and on "charset" in the stylesheet PI.)
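
The heading problem described in the TechSight entry above is easy to reproduce outside any particular product. The sketch below is an assumption about the mechanism, not a description of TechSight's code: it treats Big5 text as a plain byte string, applies ASCII uppercasing, and shows that a two-byte character whose second byte falls in a-z turns into a different character. The helper names are hypothetical, and the example relies on Python's built-in `big5` codec.

```python
def find_vulnerable_char(lead: int = 0xA4) -> str:
    """Return a Big5 character whose second byte is an ASCII lowercase letter."""
    for trail in range(0x61, 0x7B):  # byte values of 'a' .. 'z'
        raw = bytes([lead, trail])
        try:
            ch = raw.decode("big5")
        except UnicodeDecodeError:
            continue
        # Skip characters that do not encode back to the same two bytes.
        if ch.encode("big5") == raw:
            return ch
    raise LookupError("no suitable character found for this lead byte")


def naive_uppercase(text: str) -> str:
    """Uppercase Big5 text byte by byte, as encoding-unaware software might."""
    raw = text.encode("big5")
    return raw.upper().decode("big5", errors="replace")


def safe_uppercase(text: str) -> str:
    """Uppercase at the character level, which leaves Chinese characters alone."""
    return text.upper()


if __name__ == "__main__":
    ch = find_vulnerable_char()
    heading = "xml " + ch
    print(heading, "->", naive_uppercase(heading))  # Chinese character is altered
    print(heading, "->", safe_uppercase(heading))   # Chinese character is preserved
```

The design point is that case conversion has to happen after decoding, on characters rather than bytes; software that only passes strings through cannot do this safely for multi-byte encodings.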

 
