How to read and write in Japanese

From Gentoo Wiki
Jump to:navigation Jump to:search

This guide aims at explaining how to read and write in Japanese on a non-Japanese system. Please feel free to amend it based on personal knowledge or experience.

Requirements

In order to support Japanese language and characters, a number of required tools, libraries and capabilities need to be installed on the system.

Japanese fonts

Most non-Japanese systems have no Japanese fonts installed. Whenever a user tries to enter Japanese characters from the keyboard, they will only see small rectangle boxes in place of the characters on the screen.

Japanese Menus and Environment

For those interested (perhaps for immersion based learning) in having a Japanese language based environment, in order to change menus and other materials into the Japanese language, change your profile into LANG=ja_JP.UTF-8 (in your locale as well as .bash_profile[1])

Input method

To read and write in Japanese, the first thing that is needed is a way to enter Japanese characters with the keyboard. This is done via a piece of software usually called an input method. At the moment, for the Japanese language, there are 2 such common methods: anthy and mozc.

With such a software component typing "ta" on the keyboard will input the kana into the word processor. Some simple manipulation that is relevant to the way the input method works, will permit to easily switch from the hiragana to the katakana .

In a similar way typing "nihon" will input にほん and an other simple manipulation will permit to turn this to the kanji version of this word, 日本.

Note
It is not the purpose of this guide to describe in detail how the "anthy" or "mozc" input methods work. Please refer to the documentation of these software components.

IME

On top of this users also need a way to switch from the input method normally used for the primary language to the one needed for the Japanese language. This functionality is provided by another piece of software called an IME (Input Method Editor) such as app-i18n/ibus, app-i18n/scim or app-i18n/fcitx.

Once installed, this allows users to switch from one language's input method to the Japanese input method using a key combination or using the mouse to select a relevant icon in the icon tray.

Installation

Japanese fonts

As a minimum, install the media-fonts/kochi-substitute package.

root #emerge --ask kochi-substitute

Additionally, the following packages are also available:

Note
Many Japanese fonts display a yen symbol (¥) instead of a backslash (\). To see backslashes instead, the media-fonts/noto-cjk package can be used.
Note
When multiple fonts are installed, fc-config will pick a single font for Monospace, Sans and Serif. Verify which font is used with fc-match monospace, fc-match sans and fc-match serif respectively.

Input tools

It is recommended to use IBus instead of SCIM.

Warning
Compose key will stop working when ibus engine is not xkb:*.

anthy

app-i18n/ibus-anthy

root #emerge --ask ibus-anthy

mozc

app-i18n/mozc

root #USE=ibus emerge --ask app-i18n/mozc
Note
mozc has a better reputation than anthy though anthy has been proving working fine for years now. Among pros for mozc, there are the fact that it shows the candidates instantly during typing and has a better learning system in order to present the user the best candidates depending on the user's habit and context. It also provides the ATOK mechanism for quickly change an hiragana input to katakana (e.g. F7 as default before hitting space) or to other modes.
Note
When trying to use mozc some problems might occur. After selecting mozc as an input method in the ibus-preferences graphical utility, the mozc icon in the keyboard icon tray might not be visible. Open the gnome settings --> Countries & Languages, remove all Japanese input methods (in this case "anthy") and add "Japanese(Mozc)".

Configuring

See IBus article on running IBus on login.

user $ibus-setup

In the dialog box that appears, click on the "Input method" tab and add the "japanese-anthy" or "Japanese - Mozc" method. Then return to the "General" tab and define a key combination as a keyboard short cut for switching the input method.

Following useful keybindings could be setted up for "Japanese - Mozc" method:

Preferences → General → Keymap → Keymap style → Customize…

Mode Key Command
Direct input Ctrl ` Set input mode to Hiragana
Precomposition Ctrl ` Deactivate IME
Warning
You will most likely want to convert your typing to hiragana. In anthy, change the input mode from Latin to Hiragana.

Common USE Flags

The following use flags are commonly employed[2]:
cjk - Support for Hanzi-inspired characters (containing two bytes, hence the cause of accented a's cum sans cjk environment)
nls - 'native language support' - enables other languages in interface,
immqt-bc - For Qt to manage other language inputs
immqt - conflicts with immqt-bc as of Qt3.
unicode - Standard except for cursive hebrew

Latex

Here are some additional requirements to write Latex files in Japanese.

CJK and xetex support

In order to write Japanese chunks in Latex files, add support for CJK languages and for xetex in Texlive.

This can be accomplished by adding or modifying the following lines in /etc/portage/package.use:

FILE /etc/portage/package.use/latexEnabling cjk and xetex support
app-text/texlive cjk xetex
app-text/texlive-core cjk xetex

Then reinstall the packages:

root #emerge --ask --newuse app-text/texlive app-text/texlive-core

Here is a working short LaTeX sample:

FILE japanese.tex
\documentclass{article}
\usepackage{CJKutf8}
\usepackage{color}
 
\begin{document}
 
\begin{CJK}{UTF8}{min}
\section{One simple example}
\textcolor{red}{これは赤いです。}
\\
私は日本語で書けます。
\\
But I can also write with latin characters
\end{CJK}
 
\end{document}
Note
In the \begin{CJK}{UTF8}{min} command in the example above, the third argument defines the font. Here {min} is used for Mincho, but other supported fonts are Gothic ({goth}) and Maru Gothic ({maru}).

Editor configuration

To compile and visualize the output of the sample above Texmaker or Texstudio editor needs to be configured properly.

Open Texmaker, and go to Options -> Configure Texmaker. Under the Commands tab change the following:

  • At the LaTeX line, change "latex" with "platex".
  • At the Dvipdfm line, change "divipdfm" with "dvipdfmx".

Through the Fast compile tab, choose "Latex + Dvipdfm + View PDF".

Finally go to the Editor tab, choose UTF8 encoding and deselect On the fly on the dictionary line.

See also

  • IBus — an open source input framework for Linux and Unix.
  • TeX Live — a complete TeX distribution with several programs to create professional documents.