Internationalization

concerning Java and XML

Fritz Ritzberger
2006-01-06, revised 2017-11-11


Introduction

Applications normally contain two kinds of strings:
  1. internal strings that need no translation because they never appear on a user interface, and
  2. strings that must be translated into the language of the user's locale at application startup because they are shown on a user interface.
The traditional way to internationalize Java applications is a utility class such as "Language" that holds the translations of all application-defined, language-neutral strings. Every such string must be wrapped in a translation call like "Language.get(neutralString)", and the translations are loaded as resources from property files (see the "Traditional" example below).

When you build a GUI from some XML specification (e.g. XUL), that XML also will hold language-specific strings which have to be translated by the building source-code. This requires hardcoding the structure of the GUI specification: element and attribute names have to be duplicated as source-code strings to be able to access the XML.

In other words:
These XML files hold not only language-specific texts; normally they carry a lot of semantics, too. When the schema (structure) of that XML changes, the Java code that processes it has to be adapted as well. This is a classic maintenance problem. The more XML is used to externalize things, the more important this aspect becomes.

Wouldn't it be nice to have internationalization on the XML level, and maybe use XML instead of property files and resource bundles? Besides, the classic Properties.load(InputStream) always reads ISO-8859-1, so East Asian languages have to be written as \uXXXX escape sequences, which is not very readable. XML supports an encoding declaration, so you can get rid of such cryptic notation.
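
The difference between the two formats can be tried out with a few lines of Java. The following is only a sketch: the class name EncodingDemo and its two methods are made up for this illustration, and the "files" are in-memory byte arrays. It loads the same two Chinese characters once from property-file content, where they must be spelled as escape sequences, and once from XML bytes, where the parser takes the encoding from the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class, not part of the sources of this article.
class EncodingDemo
{
    /** Loads properties the classic way; load(InputStream) decodes the bytes as ISO-8859-1. */
    static String fromProperties(String fileContent)    {
        try    {
            Properties properties = new Properties();
            properties.load(new ByteArrayInputStream(fileContent.getBytes(StandardCharsets.ISO_8859_1)));
            return properties.getProperty("Cancel");
        }
        catch (IOException e)    {
            throw new UncheckedIOException(e);
        }
    }

    /** Parses XML bytes; the parser picks the encoding from the XML declaration. */
    static String fromXml(byte [] xml)    {
        try    {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml))
                .getDocumentElement().getTextContent();
        }
        catch (Exception e)    {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String [] args)    {
        // the property file has to spell the two characters as escape sequences
        String propertyFile = "Cancel=\\u53d6\\u6d88\n";
        // the XML file holds the characters themselves, here stored as UTF-8 bytes
        // (the Java literal uses escapes only to keep this source file plain ASCII)
        String xmlFile = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><string id=\"Cancel\">\u53d6\u6d88</string>";

        String fromProperties = fromProperties(propertyFile);
        String fromXml = fromXml(xmlFile.getBytes(StandardCharsets.UTF_8));
        System.out.println(fromProperties.equals(fromXml));
    }
}
```

Both calls deliver the same text; only the XML variant lets a translator type it in a readable form.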

By means of an XSLT processor (part of the JDK since 1.4), internationalization is possible as a pure XML solution (without Java!); see the code below. This concept uses a separate XML file that contains only translations.

The following Java solution follows the same naming conventions as Java resource bundles: a file named strings.xml is translated by searching for strings_de.xml, strings_fr.xml, ..., according to the platform locale. Each of these translation XML files can have its own encoding. This eases translation by a third party: you send them, for example, an English file strings_en.xml and get back the Chinese translation strings_zh.xml, without having to care about the encoding they use. Just be aware that not every GUI font can render Chinese characters!
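
The naming convention can be sketched as a small helper. The names TranslationFileNames and forLocale are assumptions made for this illustration only. Like the Language class further below, the sketch considers just the language part of the locale. Note that Locale.getLanguage() returns ISO 639 codes, so a Chinese translation is looked up under "zh".

```java
import java.util.Locale;

// Hypothetical helper, not part of the sources of this article.
class TranslationFileNames
{
    /** Inserts "_" plus the language code before the file extension. */
    static String forLocale(String fileName, Locale locale)    {
        int dot = fileName.lastIndexOf('.');
        String baseName = (dot < 0) ? fileName : fileName.substring(0, dot);
        String extension = (dot < 0) ? "" : fileName.substring(dot);
        return baseName + "_" + locale.getLanguage() + extension;
    }

    public static void main(String [] args)    {
        System.out.println(forLocale("strings.xml", Locale.GERMAN));    // strings_de.xml
        System.out.println(forLocale("strings.xml", Locale.SIMPLIFIED_CHINESE));    // strings_zh.xml
    }
}
```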

Traditional way to internationalize applications (using Java):

strings.properties:

Cancel=Default text for Cancel

strings_de.properties:

Cancel=Abbrechen

Application.java:

String neutralButtonLabel = "Cancel";
JButton button = new JButton(Language.singleton.get(neutralButtonLabel));
toolbar.add(button);

Language.java:

import java.util.*;

/**
 * Parse strings.properties, put it into some Map.
 */
public class Language
{
    // getBundle() expects the base name without the ".properties" extension
    public static final Language singleton = new Language("strings", Locale.getDefault());

    private ResourceBundle translations;

    private Language(String baseName, Locale locale)    {
       translations = ResourceBundle.getBundle(baseName, locale);
    }

    public String get(String neutralString)   {
       return translations.getString(neutralString);
    }
}

XML internationalization via XSLT:

strings.xml:

Each XML element that has to be translated must have a document-unique "id" attribute.

<?xml version="1.0"?>
<strings>
    <string id="Cancel">Default text for Cancel</string>
</strings>

strings_de.xml:

A translation addresses an XML element by "idref", which may point into ANY document structure. If you add the "target" attribute, translate.xsl finds and translates either the XML attribute of that name on the addressed element, or its sub-element of that name (first level only; mind that there must be only one sub-element with that name). The "xml:lang" attribute is for documentation only; it is not needed.

<?xml version="1.0" encoding="ISO-8859-1"?>
<translations xml:lang="de">
    <translation idref="Cancel">Abbrechen</translation>
    <!-- could also be written as <translation target="string" idref="Cancel">Abbrechen</translation> -->
</translations>
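
The addressing rule for "idref" and "target" can be tried in isolation with the XPath API of the JDK. The sketch below is an assumption-laden illustration, not part of the solution: the class TranslationLookupDemo and its lookup method are invented here, the translations are an in-memory string, and the expression is modeled on the predicate that translate.xsl (shown further below) uses for its lookup.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the sources of this article.
class TranslationLookupDemo
{
    static final String TRANSLATIONS =
        "<translations xml:lang='de'>" +
        "<translation idref='Cancel'>Abbrechen</translation>" +
        "<translation idref='open' target='label'>\u00d6ffnen</translation>" +
        "</translations>";

    /** Returns the translation for the given id and target name, or "" when there is none. */
    static String lookup(String id, String target)    {
        // a translation without 'target' matches any target name
        String expression = "//translation[@idref='" + id + "' and (not(@target) or @target='" + target + "')]";
        try    {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(TRANSLATIONS)));
        }
        catch (XPathExpressionException e)    {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String [] args)    {
        System.out.println(lookup("Cancel", "string"));    // no 'target' given, so it matches
        System.out.println(lookup("open", "label"));    // 'target' matches
        System.out.println(lookup("open", "tooltip").isEmpty());    // 'target' differs, nothing found
    }
}
```

Building the expression by string concatenation is acceptable for a demo with fixed ids; ids coming from untrusted input would need escaping.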

Application.java:

String neutralButtonLabel = "Cancel";
JButton button = new JButton(Language.singleton.get(neutralButtonLabel));
toolbar.add(button);

Language.java:

import java.util.*;
import java.io.*;
import java.net.URL;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import javax.xml.transform.sax.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * Read strings.xml,
 * transform it using translate.xsl that reads strings_de.xml,
 * process the translated result and put it into some Map.
 */
public class Language
{
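    public static final Language singleton = create("strings.xml", Locale.getDefault());

    private Properties translations = new Properties();

    // a field initializer must not throw checked exceptions, so they are wrapped here
    private static Language create(String fileName, Locale locale)    {
        try    {
            return new Language(fileName, locale);
        }
        catch (Exception e)    {
            throw new ExceptionInInitializerError(e);
        }
    }

    private Language(String fileName, Locale locale) throws Exception    {
       parse(translate(fileName, locale));
    }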

    public String get(String neutralString)   {
       return translations.getProperty(neutralString);
    }

    private void parse(byte [] xml) throws Exception    {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();
        saxParser.parse(new ByteArrayInputStream(xml), new IdTextHandler());
    }

    private byte [] translate(String fileName, Locale locale) throws Exception    {
        InputStream styleSheetStream = null;
        InputStream inputStream = null;
        ByteArrayOutputStream outputStream = null;
        try    {
            // load the stylesheet
            styleSheetStream = getClass().getResourceAsStream("translate.xsl");
            StreamSource transformSource = new StreamSource(styleSheetStream);
            Transformer transformer = TransformerFactory.newInstance().newTransformer(transformSource);

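            // build the translations URL and pass it as stylesheet parameter
            String baseName = fileName.substring(0, fileName.length() - ".xml".length());
            URL translationsUrl = getClass().getResource(baseName+"_"+locale.getLanguage()+".xml");
            // pass a String, not the URL object, because document() works on URI strings;
            // when no translation file exists, the parameter stays unset and the default texts are kept
            if (translationsUrl != null)
                transformer.setParameter("translations", translationsUrl.toExternalForm());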

            // transform the language-neutral XML file (= translate it)
            inputStream = getClass().getResourceAsStream(fileName);
            outputStream = new ByteArrayOutputStream();
          
            // now start XSLT processing
           transformer.transform(new StreamSource(inputStream), new StreamResult(outputStream));
           
           outputStream.close();    // flush
           return outputStream.toByteArray();
        }
        finally    {
           try    { styleSheetStream.close(); }   catch (Exception e)    {}
           try    { inputStream.close(); }    catch(Exception e)    {}
        }
    }



    // SAX callback handler that fills the translations Map
    private class IdTextHandler extends DefaultHandler
    {
       private String id;
       private StringBuilder currentText = new StringBuilder();

       public void startElement(String uri, String localName, String qName, Attributes attributes)   {
          id = attributes.getValue("id");
          currentText.setLength(0);
       }

       // the parser may deliver one text in several chunks, so append instead of overwrite
       public void characters(char[] ch, int start, int length)   {
          currentText.append(ch, start, length);
       }

       public void endElement(String uri, String localName, String qName)   {
          if (id != null && qName.equals("string"))
             translations.setProperty(id, currentText.toString());
       }
    }

}


translate.xsl:

<?xml version="1.0"?>

<!-- @author Fritz Ritzberger, 2006 -->

<xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
    <xsl:output method="xml" encoding="ISO-8859-1" indent="yes" />

    <xsl:param name="translations" />
    <xsl:variable name="translation-map" select="document($translations)" />


    <!-- The template for all elements, processing instructions, comments and text nodes -->
    <xsl:template match="node()">
        <xsl:choose>
           <xsl:when test="name()">    <!-- is element or PI -->
               <xsl:call-template name="translate">
                   <xsl:with-param name="node" select="." />
                   <xsl:with-param name="is-attribute" select="false()" />
               </xsl:call-template>
           </xsl:when>

           <xsl:otherwise>    <!-- is comment or text node -->
               <xsl:copy />    <!-- simply copy it identically -->
           </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- The template for all attribute nodes -->
    <xsl:template match="@*">
        <xsl:call-template name="translate">
           <xsl:with-param name="node" select="." />
           <xsl:with-param name="is-attribute" select="true()" />
        </xsl:call-template>
    </xsl:template>

    <!-- named templates -->

    <!-- This delegates to 'look-for-translation' when the passed node or its parent node has an 'id' attribute. -->
    <xsl:template name="translate">
        <xsl:param name="node"/>
        <xsl:param name="is-attribute" />

        <xsl:variable name="node-name" select="name($node)" />
        <xsl:variable name="node-id" select="$node/@id" />
        <xsl:variable name="parent-id" select="$node/../@id" />

        <xsl:choose>
           <xsl:when test="$node-id">
               <xsl:call-template name="look-for-translation">
                   <xsl:with-param name="node" select="$node" />
                   <xsl:with-param name="node-id" select="$node-id" />
                   <xsl:with-param name="target" select="$node-name" />
                   <xsl:with-param name="is-attribute" select="$is-attribute" />
               </xsl:call-template>
           </xsl:when>

           <xsl:when test="$parent-id and $node-name != 'id'">
               <xsl:call-template name="look-for-translation">
                   <xsl:with-param name="node" select="$node" />
                   <xsl:with-param name="node-id" select="$parent-id" />
                   <xsl:with-param name="target" select="$node-name" />
                   <xsl:with-param name="is-attribute" select="$is-attribute" />
               </xsl:call-template>
           </xsl:when>

           <xsl:otherwise>
               <xsl:call-template name="found-no-translation">
                   <xsl:with-param name="node" select="$node" />
               </xsl:call-template>
           </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Searches a translation for the text value of passed node with passed id and nodename. -->
    <xsl:template name="look-for-translation">
        <xsl:param name="node"/>
        <xsl:param name="node-id"/>
        <xsl:param name="target"/>
        <xsl:param name="is-attribute" />

        <xsl:variable name="translation-text" select="$translation-map//translation[
              @idref = $node-id and (not(@target) or @target = $target)]" />

        <xsl:choose>
           <xsl:when test="$translation-text">
               <xsl:choose>
                   <xsl:when test="$is-attribute">    <!-- is an attribute -->
                       <xsl:attribute name="{ name($node) }">
                           <xsl:value-of select="$translation-text" />
                       </xsl:attribute>
                   </xsl:when>

                   <xsl:otherwise>    <!-- is element or subelement -->
                       <xsl:element name="{ name($node) }">
                           <!-- append attribute copies (one pass over all attributes is enough) -->
                           <xsl:apply-templates select="$node/@*" />

                           <!-- append translation text -->
                           <xsl:value-of select="$translation-text" />

                           <!-- append contained non-text nodes -->
                           <xsl:apply-templates select="$node/*[not(text())]" />
                       </xsl:element>
                   </xsl:otherwise>
               </xsl:choose>
           </xsl:when>

           <xsl:otherwise>
               <xsl:call-template name="found-no-translation">
                   <xsl:with-param name="node" select="$node" />
               </xsl:call-template>
           </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Processes any node when no translation was found. -->
    <xsl:template name="found-no-translation">
        <xsl:param name="node"/>

        <xsl:copy>
           <xsl:apply-templates select="$node/@* | $node/node()" />
        </xsl:copy>
    </xsl:template>

</xsl:transform>
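
The hand-over of the translations file from Java to the stylesheet, via a stylesheet parameter and the document() function, can be tried in isolation. The following is a minimal sketch under stated assumptions: the class StylesheetParameterDemo is invented for this illustration, the stylesheet is a tiny stand-in and not translate.xsl, and the translations are written to a temporary file so that document() has a real URI to load. It relies on the XSLT processor bundled with the JDK being allowed to read local files, which is the usual default.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the sources of this article.
class StylesheetParameterDemo
{
    // stand-in stylesheet: outputs the translation for the id 'Cancel' as plain text
    static final String STYLESHEET =
        "<xsl:transform version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='text'/>" +
        "<xsl:param name='translations'/>" +
        "<xsl:template match='/'>" +
        "<xsl:value-of select=\"document($translations)//translation[@idref='Cancel']\"/>" +
        "</xsl:template>" +
        "</xsl:transform>";

    static String run()    {
        try    {
            // write a small translations file to a temporary location
            File translations = File.createTempFile("strings_de", ".xml");
            translations.deleteOnExit();
            Files.write(translations.toPath(),
                "<translations><translation idref='Cancel'>Abbrechen</translation></translations>"
                    .getBytes(StandardCharsets.UTF_8));

            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // the parameter value is the URI of the translations file, passed as a String
            transformer.setParameter("translations", translations.toURI().toString());

            StringWriter result = new StringWriter();
            transformer.transform(new StreamSource(new StringReader("<strings/>")), new StreamResult(result));
            return result.toString();
        }
        catch (Exception e)    {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String [] args)    {
        System.out.println(run());
    }
}
```

The input document plays no role in this stand-in; in translate.xsl the same parameter feeds the translation-map variable that every lookup uses.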



Java Example Source

import java.awt.*;
import javax.swing.*;

public class Main
{
    public static void main(String [] args)
        throws Exception
    {
        JFrame f = new JFrame("Language Test");
        f.getContentPane().setLayout(new FlowLayout());
        f.getContentPane().add(new JButton(Language.singleton.get("Cancel")));
        f.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        f.setSize(200, 200);
        f.setVisible(true);
    }
}

XML Example


    Input XML (actions.xml):
       
        <actions>
            <action id="open" label="">
                <tooltip>Open A New File</tooltip>
                <icon path="images/open.gif" />
            </action>
        </actions>


    Translation XML (actions_de.xml):
       
        <translations xml:lang="de">
            <translation idref="open" target="label">Öffnen</translation>
            <translation idref="open" target="tooltip">Neue Datei öffnen</translation>
            <!-- 'target' addresses either an attribute or a sub-element of the element with 'id'. -->
        </translations>

    Processing results:
   
        <actions>
            <action id="open" label="Öffnen">
                <tooltip>Neue Datei öffnen</tooltip>
                <icon path="images/open.gif" />
            </action>
        </actions>