
5.5 Interning Strings

If your XML data contains a large number of strings with many repetitions, it may be well worth interning these strings. Calling the intern() method on a String returns a string that has the same contents as the original, but is guaranteed to come from a pool of unique strings. This may reduce your memory footprint considerably. So, is there a simple method for interning all strings resulting from unmarshalling?
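To make the pool guarantee concrete, here is a minimal sketch using only java.lang.String. The class name InternDemo and the helper freshCopy are made up for this illustration; they are not part of JAXB.

```java
public class InternDemo {
    // Hypothetical helper: builds a new String object with the given
    // contents, so it is never the same object as a pooled literal.
    public static String freshCopy(String text) {
        return new String(text.toCharArray());
    }

    public static void main(String[] args) {
        String a = freshCopy("Vienna");
        String b = freshCopy("Vienna");

        // Two separate objects holding equal contents.
        System.out.println(a == b);                   // prints "false"
        System.out.println(a.equals(b));              // prints "true"

        // intern() maps both to the single pooled instance.
        System.out.println(a.intern() == b.intern()); // prints "true"
    }
}
```

Once only the interned references are kept, the duplicate objects become eligible for garbage collection, which is where the memory saving comes from.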

There are only two things to do. First, we customize the mapping of xsd:string to String with a small change that sneaks in our own method for parsing the XML string data to a Java string.

<jaxb:globalBindings>
  <jaxb:javaType name="String"
                 xmlType="xsd:string"
                 parseMethod="faststring.StringInterner.parseStringToString"/>
</jaxb:globalBindings>

The other thing is to write the class StringInterner, which contains a tiny wrapper for the method parseString from DatatypeConverter:

package faststring;
import javax.xml.bind.DatatypeConverter;

public class StringInterner {
    public static String parseStringToString( String value ){
        return DatatypeConverter.parseString( value ).intern();
    }
}

Peeking at the implementation of DatatypeConverter reveals that parseString just returns its argument. But it's a good strategy to go by the book and call the basic conversion, except when we are prepared to do it all on our own, as in the next example.
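
Before moving on, the effect of such a parse method can be checked without involving JAXB at all. The following is a rough sketch under stated assumptions: the class InternEffect, its stand-in parseAndIntern method, and the simulated values are all hypothetical and only imitate what an unmarshaller would hand to the parse method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class InternEffect {
    // Stand-in for a JAXB parse method (assumption: no DatatypeConverter
    // here, so the sketch runs without the javax.xml.bind classes).
    public static String parseAndIntern(String value) {
        return value.intern();
    }

    // Simulates n element values with identical text, each arriving as
    // a separate String object, optionally passed through the parse method.
    public static List<String> simulateValues(int n, boolean intern) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String raw = new String("EUR".toCharArray());
            result.add(intern ? parseAndIntern(raw) : raw);
        }
        return result;
    }

    // Counts distinct String objects by identity (==), not by equals().
    public static int distinctObjects(List<String> strings) {
        Set<String> identities =
            Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(strings);
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(simulateValues(1000, false))); // prints "1000"
        System.out.println(distinctObjects(simulateValues(1000, true)));  // prints "1"
    }
}
```

Counting by identity rather than by equals() is the point of the exercise: all values are equal in both runs, but only the interned run shares a single object.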

