Meet the New Kanji Keywords

All kanji and their components have been assigned new keywords inferred from their etymology or most common usage in Japanese. This article presents the methodology used to select those keywords.

In Kanjiverse, all kanji are annotated with a word (or short phrase) everywhere they appear: in the kanji grid, chip tags, Glyph Graph, Formula cards, etc. These keywords help you quickly identify the sense of a kanji and discriminate between similar-looking characters. Read on to find out more about the methodology used to select them.

Methodology

The following methodology was painstakingly applied to assign a keyword to each of the 2136 jōyō kanji, 863 jinmeiyō kanji (used in personal names), and 998 supplementary characters that appear as components of those kanji. Each character goes through the following steps in order, and its keyword is assigned as soon as a step's condition is met.

  1. If a character is one of the 214 official Kangxi radicals (and their variants), its radical name is used as keyword, with a preference for the Japanese name when it differs from the standard Kangxi name.

釆 is called distinguish in the Kangxi dictionary but のごめへん topped rice in Japanese, so the keyword topped rice is preferred.

  2. All variants of a character are assigned the same keyword.

𧘇 and 衤 are radical variants of 衣 and therefore all share the same keyword clothes.

  3. If a character is not a radical but one of the 2136 official Jōyō Kanji, its keyword is selected from public sources such as Wiktionary or KANJIDIC. If it has multiple meanings, only one is assigned, based on its most frequent usage in Japanese words.

治 can mean rule as well as cure, but since it appears more frequently in words such as 政治 government or 治める govern, rather than 治療 (medical) treatment, the keyword subdue is assigned.

  4. If a character is included in the Jinmeiyō Kanji list (863 extra kanji that can legally be used in names in Japan), the same methodology as 3. is applied first. But if the kanji appears exclusively in personal names in modern Japanese, its etymology or current Chinese usage is preferred when it provides a more memorable keyword.

Since 娃 is not used in common words in Japanese but means baby girl or doll in Chinese, the keyword baby doll is coined.

  5. If a character is the traditional form variant (旧字体 kyūjitai) of a kanji that already exists in a simplified form (新字体 shinjitai), both are assigned the same keyword.

黑 is the traditional form of 黒, therefore they both have the keyword black.

  6. If a character does not exist as a standalone kanji but only as a component of more complex kanji (表外字 hyōgaiji, kanji that are not in the jōyō or jinmeiyō lists), the same methodology as 4. is applied. More often than not, the keyword is inferred from its Chinese origins.

哥 is not used by itself in Japanese, only as the left component of 歌 song. It is, however, a very common 漢字 in Chinese, in words such as 哥哥 (elder) brother, so its meaning of elder brother is used as the keyword.

  7. If a component is a pictogram (stylized drawing of an object) or ideogram (symbol of an abstract idea), this etymology is used.

㞢 is a pictogram of a cattle head. 上 and 下 are ideograms of arrows pointing up and down.

  8. If a component is a standard simplified 漢字 in China (whose simplified characters differ from Japan's, often going even further in the simplification), this contemporary meaning is assigned. If it is also a variant of a standard Japanese kanji, they share the same keyword.

乡 is used in China in place of 郷, and even though it is only found as a component in Japanese kanji, 乡 is assigned the same keyword hometown as 郷 since they are both simplifications of the traditional character 鄕.

  9. If a component is an archaic form (not even used in modern Chinese as a standalone character), this older meaning is kept.

亖 is an archaic form of 四, so they both have the keyword four.
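The steps above amount to a priority cascade: each rule is tried in order and the first one that yields a keyword wins, while variants simply inherit the keyword of their canonical form. The following Python sketch illustrates that idea only. The `Glyph` record, its field names, the three condensed rules, and the `assign_keyword` helper are all assumptions made for illustration, not the actual Kanjiverse code, and the nine steps are collapsed into three rules for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Glyph:
    """Hypothetical record for one character (field names are illustrative)."""
    char: str
    radical_name: Optional[str] = None    # step 1: radical name, Japanese name preferred
    variant_of: Optional[str] = None      # steps 2, 5, 8: canonical form sharing the keyword
    usage_meaning: Optional[str] = None   # steps 3, 4: most frequent meaning in Japanese words
    names_only: bool = False              # step 4: appears only in personal names
    origin_meaning: Optional[str] = None  # steps 4, 6 to 9: etymology or Chinese usage


def by_radical(glyph: Glyph) -> Optional[str]:
    return glyph.radical_name


def by_usage(glyph: Glyph) -> Optional[str]:
    # A name-only kanji prefers its etymological or Chinese meaning when known.
    if glyph.names_only and glyph.origin_meaning:
        return glyph.origin_meaning
    return glyph.usage_meaning


def by_origin(glyph: Glyph) -> Optional[str]:
    return glyph.origin_meaning


# Rules are tried in order; the first one that yields a keyword wins.
RULES: List[Callable[[Glyph], Optional[str]]] = [by_radical, by_usage, by_origin]


def assign_keyword(char: str, table: Dict[str, Glyph]) -> Optional[str]:
    glyph = table[char]
    # Variants inherit the keyword of their canonical form.
    if glyph.variant_of is not None:
        return assign_keyword(glyph.variant_of, table)
    for rule in RULES:
        keyword = rule(glyph)
        if keyword:
            return keyword
    return None


# Sample entries taken from the examples in this article.
TABLE = {g.char: g for g in [
    Glyph("衣", radical_name="clothes"),
    Glyph("衤", variant_of="衣"),
    Glyph("治", usage_meaning="subdue"),
    Glyph("娃", names_only=True, origin_meaning="baby doll"),
    Glyph("哥", origin_meaning="elder brother"),
    Glyph("四", usage_meaning="four"),
    Glyph("亖", variant_of="四"),
]}

for char in TABLE:
    print(char, assign_keyword(char, TABLE))
```

Resolving variants before applying any rule is a simplification: it guarantees that 衤 and 衣, or 亖 and 四, can never drift apart, at the cost of requiring every variant to point to an entry that exists in the table.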

Limitations

Unfortunately, this system has its limits, and the following trade-offs had to be taken into consideration when applying the above methodology:

  • A kanji can have multiple meanings. For the sake of conciseness, the chosen keyword only covers one of them:
高 is assigned the keyword tall even though, depending on the context, it can also mean expensive.
  • Multiple kanji can have similar meanings. Unless they are true variants of each other, different keywords with similar meanings had to be assigned:
自 and 己 both mean (one)self, but to distinguish them from each other, oneself is assigned to the former and self to the latter. Note that such near-synonyms often appear together in a word of the same meaning, like 自己 (one)self.
  • The current (Japanese) usage does not always match the etymology. Not only have the characters' shapes and meanings evolved naturally over the centuries, but Japan's postwar reform to simplify the traditional characters (旧字体 kyūjitai) has also produced simplifications (新字体 shinjitai) that can obscure the etymology of a character, for instance by swapping a component with a simpler one that looked (or sounded) similar but has nothing to do with its original meaning.
靑 blue was simplified into its current form 青 by replacing 円 circle with 月 moon.
The traditional form of 脂 fat was simplified into 脂. Although the two might look exactly the same on your screen (a font issue caused by Unicode Han unification), the left component was originally ⺼ flesh and not 月 moon.
⺼ is actually the radical form of 肉 meat which is why it appears as the semantic component in all kanji related to body parts such as 腕 arm, 胸 breast, 肩 shoulder, 肺 lungs, etc.
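When two characters look identical on screen, comparing their Unicode code points is one way to tell them apart. The short Python sketch below assumes that the flesh radical shown above is U+2EBC; the `codepoints` helper is a hypothetical name introduced here for illustration. The characters are written as escape sequences so that the example does not depend on the font or encoding of this page.

```python
from typing import List


def codepoints(text: str) -> List[str]:
    """Return the Unicode code point of each character as 'U+XXXX'."""
    return [f"U+{ord(ch):04X}" for ch in text]


moon = "\u6708"           # 月 moon
flesh_radical = "\u2ebc"  # ⺼ flesh (radical form, assumed to be U+2EBC)
meat = "\u8089"           # 肉 meat

for label, ch in [("moon", moon), ("flesh", flesh_radical), ("meat", meat)]:
    print(label, ch, codepoints(ch))
```

Even if a font draws 月 and ⺼ with the same glyph, their code points differ, so a lookup table keyed on the character itself can still give them different keywords.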

Conclusion

I hope this provided some clarification as to why a particular keyword was assigned to a character. Below you can access different lists of kanji and their keywords, ordered by frequency of usage. They are distributed under CC-BY-SA 4.0, so feel free to use them in your own projects by mentioning Kanjiverse as the source, with a link to the corresponding list:

Please note that those keyword lists are not in their definitive version and haven't been reviewed by a native speaker yet. There might still be typos, misinterpretations, or other inaccuracies. If you spot any, please report them to me using the social links below so we can improve the lists for everyone :)