Meet the New Kanji Keywords
All kanji and their components have been assigned new keywords inferred from their etymology or most common usage in Japanese. This article presents the methodology used to select those keywords.
In Kanjiverse, every kanji is annotated with a word (or short phrase) wherever it appears: in the kanji grid, chip tags, Glyph Graph, Formula cards, etc. These keywords help you quickly identify the sense of a kanji and discriminate between similar-looking characters. Read on to find out more about the methodology used to select them.
Methodology
The following methodology was painstakingly applied to assign a keyword to each of the 2136 jōyō kanji, 863 jinmeiyō kanji (used in personal names), and 998 supplementary characters that appear as components of those kanji. Each character goes through the following steps in order, and its keyword is assigned as soon as the condition of a step is met.
- If a character is one of the 214 official Kangxi radicals (and their variants), its radical name is used as keyword, with a preference for the Japanese name when it differs from the standard Kangxi name.
釆 is called "distinguish" in the Kangxi dictionary but のごめへん ("topped rice") in Japanese, therefore the keyword "topped rice" is preferred.
- All variants of a character are assigned the same keyword.
𧘇 and 衤 are radical variants of 衣 and therefore all share the same keyword "clothes".
- If a character is not a radical but is one of the 2136 official Jōyō Kanji, its keyword is selected from public sources such as Wiktionary or KANJIDIC. If it has multiple meanings, only one is assigned, based on its most frequent usage in Japanese words.
治 can mean "rule" as well as "cure", but since it appears more frequently in words such as 政治 "government" or 治める "govern", rather than 治療 "(medical) treatment", the keyword "subdue" is assigned.
- If a character is included in the Jinmeiyō Kanji list (863 extra kanji that can legally be used in names in Japan), the same methodology as in step 3 (Jōyō Kanji) is applied first. But if the kanji appears exclusively in personal names in modern Japanese, its etymology or current Chinese usage is preferred when it provides a more memorable keyword.
Since 娃 is not used in common words in Japanese but means "baby girl" or "doll" in Chinese, the keyword "baby doll" is coined.
- If a character is the traditional form (旧字体 kyūjitai) of a kanji that already exists in a simplified form (新字体 shinjitai), both are assigned the same keyword.
黑 is the traditional form of 黒, therefore they both have the keyword "black".
- If a character does not exist as a standalone kanji (表外字 hyōgaiji, kanji that are not in the jōyō or jinmeiyō lists) but only as a component of more complex kanji, the same methodology as in step 4 (Jinmeiyō Kanji) is applied, but more often than not the keyword is inferred from its Chinese origins.
哥 is not used by itself in Japanese but only as the left component of 歌 "song". It is, however, a very common 漢字 in Chinese, in words such as 哥哥 "(elder) brother", therefore its meaning of "elder brother" is assigned as its keyword.
- If a component is a pictogram (stylized drawing of an object) or ideogram (symbol of an abstract idea), this etymology is used.
㞢 is a pictogram of a "cattle head". 上 and 下 are ideograms of arrows pointing "up" and "down".
- If a component is a standard simplified 漢字 in China (whose simplified characters differ from Japan's, often going even further in the simplification), its contemporary meaning is assigned. If it is also a variant of a standard Japanese kanji, they will share the same keyword.
乡 is used in China in place of 郷, and even though it is only found as a component in Japanese kanji, 乡 is assigned the same keyword "hometown" as 郷 since they are both simplifications of the traditional character 鄕.
- If a component is an archaic form (not even used in modern Chinese as a standalone character), this older meaning is kept.
亖 is an archaic form of 四, therefore they both have the keyword "four".
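The steps above amount to a first-match-wins cascade, which can be sketched in Python. This is only an illustration and not the actual Kanjiverse implementation: the `Character` fields, the rule functions, and the sample word counts are all hypothetical, and only four of the rules are modeled.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Character:
    """Hypothetical record; the field names are assumptions for this sketch."""
    glyph: str
    radical_name: Optional[str] = None     # set when the glyph is a Kangxi radical
    variant_of: Optional[str] = None       # base glyph, when this one is a variant
    meanings_in_words: List[str] = field(default_factory=list)  # one entry per sampled word
    chinese_meaning: Optional[str] = None  # fallback taken from Chinese usage


# Keywords assigned so far, looked up by the variant rule.
assigned: Dict[str, str] = {}


def radical_rule(c: Character) -> Optional[str]:
    return c.radical_name


def variant_rule(c: Character) -> Optional[str]:
    # A variant reuses the keyword of its base character, if already assigned.
    return assigned.get(c.variant_of) if c.variant_of else None


def frequent_usage_rule(c: Character) -> Optional[str]:
    # Keep only the meaning that occurs most often among the sampled words.
    if not c.meanings_in_words:
        return None
    return Counter(c.meanings_in_words).most_common(1)[0][0]


def chinese_origin_rule(c: Character) -> Optional[str]:
    return c.chinese_meaning


# Order matters: the first rule that returns a keyword wins.
RULES: List[Callable[[Character], Optional[str]]] = [
    radical_rule,
    variant_rule,
    frequent_usage_rule,
    chinese_origin_rule,
]


def assign_keyword(c: Character) -> Optional[str]:
    for rule in RULES:
        keyword = rule(c)
        if keyword is not None:
            assigned[c.glyph] = keyword
            return keyword
    return None  # no rule applied; the character needs manual review


print(assign_keyword(Character("衣", radical_name="clothes")))        # clothes
print(assign_keyword(Character("衤", variant_of="衣")))                # clothes
print(assign_keyword(Character("高", meanings_in_words=["tall", "tall", "expensive"])))  # tall
print(assign_keyword(Character("哥", chinese_meaning="elder brother")))  # elder brother
```

Keeping the rules in an ordered list mirrors the "as soon as the condition is met" wording: reordering the list changes which keyword a character ends up with.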
Limitations
Unfortunately, this system has its limits, and the following trade-offs had to be taken into consideration when applying the above methodology:
- A kanji can have multiple meanings. For the sake of conciseness, the chosen keyword only covers one of them:
高 is assigned the keyword "tall" even though, depending on the context, it can also mean "expensive".
- Multiple kanji can have similar meanings. Unless they are true variants of each other, different keywords with similar meanings had to be assigned:
自 and 己 both mean "(one)self", but to distinguish them from each other, "oneself" is assigned to the former and "self" to the latter. Note that such near-synonyms often appear together in a word of the same meaning, like 自己 "(one)self".
- The current (Japanese) usage does not always match the etymology. Not only have the characters' shapes and meanings evolved naturally over the centuries, but Japan's postwar reform to simplify the traditional characters (旧字体 kyūjitai) also produced simplifications (新字体 shinjitai) that can obscure the etymology of a character, for instance by swapping a component for a simpler one that looked (or sounded) similar but has nothing to do with the original meaning.
靑 "blue" was simplified into its current form 青 by replacing 円 "circle" with 月 "moon".
脂 "fat" was simplified into 脂. The two forms might look exactly the same on your screen (a font issue due to Unicode Han unification), but the left component was originally ⺼ "flesh" and not 月 "moon".
⺼ is actually the radical form of 肉 "meat", which is why it appears as the semantic component in all kanji related to body parts, such as 腕 "arm", 胸 "breast", 肩 "shoulder", 肺 "lungs", etc.
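When two glyphs look identical on screen, comparing their code points is more reliable than comparing their rendering. The minimal Python sketch below uses the standard `unicodedata` module and assumes that the radical shown above is U+2EBC. Escape sequences are used so that the example does not depend on your font. It only covers the standalone components: the composed kanji 脂 is a single unified code point, so its two historical forms cannot be told apart this way.

```python
import unicodedata

MEAT_RADICAL = "\u2ebc"  # the radical form of 肉 (assumed to be the glyph used above)
MOON = "\u6708"          # 月

# Print the code point and official Unicode name of each component.
for ch in (MEAT_RADICAL, MOON):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# The two components are different characters, whatever the font shows.
print(MEAT_RADICAL == MOON)
```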
Conclusion
I hope this provided some clarification as to why a particular keyword was assigned to a character. Below you can access different lists of kanji and their keywords, ordered by frequency of usage. They are distributed under CC-BY-SA 4.0, so feel free to use them in your own projects as long as you mention Kanjiverse as the source, with a link to the corresponding list:
- All 3997 Kanji and components
- Only the 2136 Jōyō Kanji and their components
- Only the 863 Jinmeiyō Kanji and their components that are not already in the Jōyō list
- Only the 214 Radicals, their variants and components
Please note that these keyword lists are not final and haven't been reviewed by a native speaker yet. There might still be typos, misinterpretations, or other inaccuracies. If you spot any, please report them to me using the social links below so we can improve the lists for everyone :)