Using Unicode Variable Names


The Java Specialists' Newsletter [Issue 036] - Using Unicode Variable Names

Author: Dr. Heinz M. Kabutz

JDK version:

Category: Language

You can subscribe from our home page: http://www.javaspecialists.co.za (which also hosts all previous issues, available free of charge).

Welcome to the 36th edition of "The Java(tm) Specialists' Newsletter". This week, we will look at the strange things that happen when we try to use Unicode characters in our code.

I am sitting outside in my garden, with beautiful sunshine and a pitbull terrier at my command. Approximately a month ago, the biggest software vendor in South Africa went bankrupt, severely affecting the availability of software in this country. Fortunately for me, I have friends in convenient places: I purchased the software that I needed (Dragon NaturallySpeaking) from Amazon in Germany and had it shipped to infor AG, whom I have spoken about in other newsletters. They very kindly shipped it down to the end of the earth.

As a result of using Dragon NaturallySpeaking, you will probably notice that my newsletters will have an even more conversational style than before. I am always looking at ways in which I can improve my newsletters and serve you better. Please remember to forward this newsletter to friends and colleagues who are interested in Java.

A special welcome to country No 56, Malta! My wife's previous boss at a hotel was the Maltese ambassador for Cape Town, which was really cool, as he had diplomatic immunity from parking fines and speeding fines. Mind you, traffic laws are rather lax in this country; I have only had one speeding fine in my life, and I drive an Alfa Romeo!

South Africa has just become the cheapest country in the world! We are the first country where a Big Mac costs less than US$ 1. It is cheaper here even than in the Philippines and China. I had a good response to my advert for my Java Course (thank you for your patience in this regard) and so I definitely want to develop the idea of running courses in South Africa, combined with a holiday.

How do you go from being an OO beginner to an OO guru? Simple answer: Experience! But what if you can't wait 10 years to get that experience? Simple answer: Design Patterns! How can you learn Design Patterns in a relaxed setting from someone who has used them in the real world? Simple answer: Ask about my new course "Design Patterns - The Timeless Way of Coding".

1707 members are currently subscribed from 56 countries

Using Unicode Variable Names

A few months ago, I was reading a book written by the authors of Java, when I stumbled across a piece of code that was using Unicode characters as variable names. Being the curious type, I immediately tried writing a piece of code that used funny characters. Easier said than done! I don't know of any Java IDE that supports Unicode. The common e-mail systems in this world would also choke like a dog on a chicken bone if I sent you a newsletter containing Unicode characters.

Before I get into how we could use Unicode characters in our variables, let's just take a step back and think about it: Imagine being called in by a Japanese company that has a memory leak in their program which they want you to fix (one of the most common tasks I have been asked to perform), and imagine if in their company they used Japanese characters for their variables. Yes, it would compile if you follow the ideas in this newsletter, but what would the result be for me? I would probably pack my bags and head back home! It's bad enough having to read code where the variable names are in German or in Afrikaans; I cannot imagine trying to understand code where I don't even know the characters used in variable names!

Since I could not find an IDE that supported Unicode, my first job was to write a Unicode editor. Also easier said than done. I had learned many years ago that Writers and Readers are used for Unicode characters, but I had never really used Unicode before. My first approach at reading and writing Unicode files looked something like this:

// "filename" is a String field of the enclosing class
public void load() throws IOException {
  BufferedReader in = new BufferedReader(new FileReader(filename));
  String s;
  while((s = in.readLine()) != null) {
    // ...
  }
  in.close();
}

Did you know that FileReader extends InputStreamReader? In its constructor it constructs a FileInputStream that it passes to its parent. The InputStreamReader has a constructor that takes as argument the encoding used for reading files. FileReader unfortunately does not expose the constructor that takes the encoding as an argument; it simply uses an operating-system-dependent encoding. One cannot but wonder what the author of the FileReader had been smoking the day he/she wrote that code ...

(Actually, when I wrote the Sun Microsystems Java programmer examination a few years ago, the only non-GUI question that I got wrong was a question relating to reading ISO-8859-1 data. Perhaps there has always been a hole in my knowledge regarding this topic.)

Should you want to read a file in an encoding different to the standard one, you cannot use the FileReader; you would have to construct the InputStreamReader yourself, as follows:

public void load() throws IOException {
  // Wrap the raw byte stream ourselves so that we can name the encoding
  BufferedReader in = new BufferedReader(
    new InputStreamReader(
      new FileInputStream(filename), "UTF-16BE"));
  String s;
  while((s = in.readLine()) != null) {
    // ...
  }
  in.close();
}
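
To see why the encoding has to match on both sides, here is a minimal sketch that writes a line through an OutputStreamWriter and reads it back through an InputStreamReader. It is an illustration only: the class name EncodingRoundTrip and its helper methods are made up for this example, and it uses in-memory byte arrays instead of files so that it can run anywhere.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical helper class, not part of the newsletter's editor.
class EncodingRoundTrip {
  // Encodes the text into bytes using the named encoding.
  static byte[] encode(String text, String encoding) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Writer out = new OutputStreamWriter(bytes, encoding);
    out.write(text);
    out.close(); // flushes the encoder into the byte array
    return bytes.toByteArray();
  }

  // Decodes the first line of the bytes using the named encoding.
  static String decodeFirstLine(byte[] data, String encoding)
      throws IOException {
    BufferedReader in = new BufferedReader(
      new InputStreamReader(new ByteArrayInputStream(data), encoding));
    try {
      return in.readLine();
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws IOException {
    String original = "double \u03C0 = 3.14;";
    byte[] data = encode(original, "UTF-16BE");
    // Same encoding on both sides: the text survives.
    System.out.println(original.equals(decodeFirstLine(data, "UTF-16BE")));
    // Different encoding on the reading side: the text is garbled.
    System.out.println(original.equals(decodeFirstLine(data, "ISO-8859-1")));
  }
}
```

The first comparison should print true and the second false, since ISO-8859-1 turns every byte of the two-byte characters into a character of its own.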

Without further ado, here is the code for a Unicode text editor. It allows you to insert Unicode characters by entering their decimal values and pressing the appropriate button. For the design, I have followed an approach I saw a few years ago on jGuru, where all the GUI elements are created lazily. It makes the GUI code very nicely maintainable, as you never have to worry in what order elements are constructed.

import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import java.io.*;

public class UnicodeEditor extends JFrame {
  // GUI elements are created lazily by their getters
  private JPanel buttonPanel;
  private JScrollPane editorPanel;
  private JTextArea editor;
  private final String filename;
  private final String encoding;

  public UnicodeEditor(String filename, String encoding)
      throws IOException {
    this.filename = filename;
    this.encoding = encoding;
    getContentPane().add(getButtonPanel(), BorderLayout.NORTH);
    getContentPane().add(getEditorPanel(), BorderLayout.CENTER);
    load();
  }

  protected JPanel getButtonPanel() {
    if (buttonPanel == null) {
      buttonPanel = new JPanel();
      JButton unicodeInsert = new JButton("Insert Unicode:");
      final JTextField unicodeField = new JTextField(8);
      JButton saveExit = new JButton("Save & Exit");
      unicodeInsert.addActionListener(new ActionListener() {
        public void actionPerformed(ActionEvent e) {
          // Insert the character with the given decimal value at the caret
          getEditor().insert(
            "" + (char)Integer.parseInt(unicodeField.getText()),
            getEditor().getCaretPosition());
        }
      });
      saveExit.addActionListener(new ActionListener() {
        public void actionPerformed(ActionEvent e) {
          try {
            save();
            System.exit(0);
          } catch(IOException ex) { ex.printStackTrace(); }
        }
      });
      buttonPanel.add(unicodeInsert);
      buttonPanel.add(unicodeField);
      buttonPanel.add(saveExit);
    }
    return buttonPanel;
  }

  protected JTextArea getEditor() {
    if (editor == null) {
      editor = new JTextArea();
    }
    return editor;
  }

  protected JScrollPane getEditorPanel() {
    if (editorPanel == null) {
      editorPanel = new JScrollPane(getEditor());
    }
    return editorPanel;
  }

  // Reads the whole file, decoding it with the chosen encoding
  protected void load() throws IOException {
    BufferedReader in = new BufferedReader(new InputStreamReader(
      new FileInputStream(filename), encoding));
    StringBuffer buf = new StringBuffer();
    int i;
    while((i = in.read()) != -1) buf.append((char)i);
    in.close();
    getEditor().setText(buf.toString());
  }

  // Writes the editor contents back, encoding them with the chosen encoding
  protected void save() throws IOException {
    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
      new FileOutputStream(filename), encoding));
    char[] text = getEditor().getText().toCharArray();
    for (int i=0; i<text.length; i++) out.write(text[i]);
    out.close();
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 1)
      throw new IllegalArgumentException(
        "usage: UnicodeEditor filename [encoding]");
    String encoding = (args.length == 2)?args[1]:"UTF-16BE";
    UnicodeEditor editor = new UnicodeEditor(args[0], encoding);
    editor.setSize(500, 500);
    editor.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
    editor.setVisible(true);
  }
}
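
The "Insert Unicode:" button relies on a cast from a decimal number to a char. The following is a small sketch of that conversion on its own, outside the GUI. The class name CodePointInput and both method names are invented for this illustration; the second method is an assumption about how one might extend the idea to values above 65535, which a single char cannot hold.

```java
// Hypothetical helper class illustrating the editor's decimal-to-char cast.
class CodePointInput {
  // Mirrors the editor's cast: a decimal string becomes a single char.
  static String fromDecimal(String decimal) {
    return "" + (char) Integer.parseInt(decimal.trim());
  }

  // Variant that also handles values above 65535 by producing a
  // surrogate pair (two chars) where necessary.
  static String fromDecimalCodePoint(String decimal) {
    return new String(Character.toChars(Integer.parseInt(decimal.trim())));
  }

  public static void main(String[] args) {
    // 960 is the decimal value of the Greek small letter pi (hex 03C0).
    System.out.println(fromDecimal("960").equals("\u03C0")); // true
    // 128512 (hex 1F600) lies outside the 16-bit range and needs two chars.
    System.out.println(fromDecimalCodePoint("128512").length()); // 2
  }
}
```

So entering 960 and pressing the button should give you the Greek pi, and 949 the Greek epsilon.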

By default, the editor uses the UTF-16BE format, standing for Sixteen-bit Unicode Transformation Format, big-endian byte order. You can specify any encoding when you start the editor, such as UTF-8, ISO-8859-1, etc. But, before we use this editor, we first need to have a file containing Unicode characters. I've written a code generator that generates two files, MathsSymbols.java and MathsSymbolsTest.java:

import java.io.*;

public class UnicodeVariableGenerator {
  public static void generateMathsSymbols() throws IOException {
    PrintWriter out = new PrintWriter(new OutputStreamWriter(
      new FileOutputStream("MathsSymbols.java"), "UTF-16BE"));
    out.println("public interface MathsSymbols {");
    out.print(  "  public static final double ");
    out.print('\u03C0'); // Greek small letter pi as the constant name
    out.println(" = 3.14159265358979323846;");
    out.print(  "  public static final double ");
    out.print('\u03B5'); // Greek small letter epsilon as the constant name
    out.println(" = 2.7182818284590452354;");
    out.println("}");
    out.close();
  }

  public static void generateMathsSymbolsTest() throws IOException {
    PrintWriter out = new PrintWriter(new OutputStreamWriter(
      new FileOutputStream("MathsSymbolsTest.java"), "UTF-16BE"));
    out.println("public class MathsSymbolsTest implements MathsSymbols {");
    out.println("  public static void main(String args[]) {");
    out.println("    System.out.println(\"The value of PI is: \" + \u03C0);");
    out.println("    System.out.println(\"The value of E is: \" + \u03B5);");
    out.println("  }");
    out.println("}");
    out.close();
  }

  public static void main(String[] args) throws IOException {
    generateMathsSymbols();
    generateMathsSymbolsTest();
  }
}

I won't include the code for MathsSymbols.java and MathsSymbolsTest.java; please run the UnicodeVariableGenerator class to generate that code. I already bomb out enough mailing systems by sending my newsletters in HTML (*evil grin*), so there is no use in causing more trouble by using Unicode. Once you've run the UnicodeVariableGenerator, please load the MathsSymbols.java file with the UnicodeEditor, using UTF-16BE, and have a look at it: you should see the Greek symbol for PI.
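
If you would rather keep a source file in plain ASCII, Java also accepts Unicode escapes inside identifiers, so the same variable names can be written without any special editor or encoding. The following is a minimal sketch of that alternative, not the generated files themselves; the class name EscapedMathsSymbols is made up for this illustration.

```java
// Hypothetical ASCII-only counterpart to the generated MathsSymbols files.
class EscapedMathsSymbols {
  // The compiler translates each escape before it reads the identifier,
  // so these two constants are named with the Greek letters pi and epsilon.
  static final double \u03C0 = 3.14159265358979323846;
  static final double \u03B5 = 2.7182818284590452354;

  public static void main(String[] args) {
    System.out.println("The value of PI is: " + \u03C0);
    System.out.println("The value of E is: " + \u03B5);
  }
}
```

This compiles with a plain javac invocation, although it is arguably even harder to read than the real Greek letters.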

The last "trick" you need to know about is how to compile the MathsSymbols.java and MathsSymbolsTest.java. If you open the files with notepad or vi, you will probably see a rather strangely formatted file, with two bytes being used per character. When you compile these files, you therefore have to specify the character encoding used:

javac -encoding UTF-16BE MathsSymbols*.java
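
To get a feeling for what notepad or vi are showing you, here is a small sketch that prints the bytes of a two-character string in a few encodings. The class name EncodedSizes and its hex helper are made up for this illustration; only String.getBytes with a named encoding comes from the JDK.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for inspecting encoded bytes.
class EncodedSizes {
  // Returns the encoded bytes as space-separated two-digit hex values.
  static String hex(String text, String encoding)
      throws UnsupportedEncodingException {
    StringBuilder sb = new StringBuilder();
    for (byte b : text.getBytes(encoding)) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args)
      throws UnsupportedEncodingException {
    String text = "a\u03C0"; // the letter a followed by the Greek pi
    System.out.println("UTF-16BE: " + hex(text, "UTF-16BE")); // 00 61 03 C0
    System.out.println("UTF-16LE: " + hex(text, "UTF-16LE")); // 61 00 C0 03
    System.out.println("UTF-8:    " + hex(text, "UTF-8"));    // 61 CF 80
  }
}
```

In UTF-16BE even the plain letter takes two bytes with a leading zero byte, which is why the compiler has to be told about the encoding.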

That's it! And it has kept me busy longer than just about all the other newsletters to try and get it right. Another interesting variation of this is where David Treves (who I met through a really cool advanced Java chat list - JavaDesk on YahooGroups - where you get shouted at if you ask beginner questions) tried to write/read Hebrew to the Database. He doggedly tried to get it working until eventually he succeeded - after I had given up hope of ever figuring it out. Stay tuned for the next few weeks to see how he did it.

Until next week, when we celebrate our first anniversary as the most interesting Java newsletter on the Internet.

Kind regards


Copyright 2000-2004 Maximum Solutions, South Africa

Reprint Rights. Copyright subsists in all the material included in this email, but you may freely share the entire email with anyone you feel may be interested, and you may reprint excerpts both online and offline provided that you acknowledge the source as follows: This material from The Java(tm) Specialists' Newsletter by Maximum Solutions (South Africa). Please contact Maximum Solutions for more information.

Java and Sun are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. Maximum Solutions is independent of Sun Microsystems, Inc.
