Meet the New Kanji Keywords
All kanji and their components have been assigned new keywords inferred from their etymology or most common usage in Japanese. This article presents the methodology used to select those keywords.
In Kanjiverse, all kanji are annotated with a word (or short phrase) everywhere they appear: in the kanji grid, chip tags, Glyph Graph, Formula cards, etc. These keywords help you quickly identify the sense of a kanji and discriminate between similar-looking characters. Read on to find out more about the methodology used to select them.
The following methodology was painstakingly applied to assign a keyword to each of the 2136 jōyō kanji, 863 jinmeiyō kanji (used in personal names), and 998 supplementary characters that appear as components of those kanji. Each character goes through the following steps in order, and its keyword is assigned as soon as a step's condition is met.
- If a character is one of the 214 official Kangxi radicals (or one of their variants), its radical name is used as the keyword, with a preference for the Japanese name when it differs from the standard Kangxi name.
釆 is called "distinguish" in the Kangxi dictionary but のごめへん "topped rice" in Japanese, therefore the keyword "topped rice" is preferred.
- All variants of a character are assigned the same keyword.
𧘇 and 衤 are radical variants of 衣 and therefore all share the same keyword.
- If a character is not a radical but one of the 2136 official Jōyō Kanji, its keyword is selected from public sources such as Wiktionary or KANJIDIC. In case of multiple meanings, only one is assigned, based on its most frequent usage in Japanese words.
治 can mean "rule" as well as "cure", but since it appears more frequently in words such as 政治 "govern" than in words such as 治療 "(medical) treatment", the keyword is taken from the first sense.
- If a character is included in the Jinmeiyō Kanji list (863 extra kanji that can legally be used in names in Japan), the same methodology as for Jōyō Kanji (step 3) is applied first. But if this kanji appears exclusively in personal names in modern Japanese, its etymology or current Chinese usage is preferred if it provides a more memorable keyword.
Since 娃 is not used in common words in Japanese but means "doll" in Chinese, the keyword "baby doll" is coined.
- If a character is the traditional form (旧字体 kyūjitai) of a kanji that already exists in a simplified form (新字体 shinjitai), both are assigned the same keyword.
黑 is the traditional form of 黒, therefore they both have the same keyword.
- If a character does not exist as a standalone kanji (表外字 hyōgaiji, kanji that are not in the jōyō or jinmeiyō lists) but only as a component of more complex kanji, the same methodology as for Jinmeiyō Kanji (step 4) is applied, but more often than not, the keyword is inferred from its Chinese origins.
哥 is not used by itself in Japanese but only as the left component of 歌 "song". It is, however, a very common 漢字 in Chinese, in words such as 哥哥 "(elder) brother", therefore the keyword "elder brother" is assigned.
- If a component is a pictogram (stylized drawing of an object) or ideogram (symbol of an abstract idea), this etymology is used.
㞢 is a pictogram of a "cattle head". 上 and 下 are ideograms of arrows pointing up and down.
- If a component is a standard simplified 漢字 in China (China's simplified characters differ from Japan's, often going even further in the simplification), this contemporary meaning is assigned. If it is also a variant of a standard Japanese kanji, the two share the same keyword.
乡 is used in China in place of 郷, and even though it is only found as a component in Japanese kanji, 乡 is assigned the same keyword "hometown" as 郷 since they are both simplifications of the traditional character 鄕.
- If a component is an archaic form (not even used in modern Chinese as a standalone character), this older meaning is kept.
亖 is an archaic form of 四, therefore they both have the same keyword.
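The "first matching step wins" logic of the list above can be sketched in a few lines of Python. This is only an illustration, not the actual Kanjiverse implementation: the record fields (`radical_name`, `variant_of`, `japanese_sense`, `chinese_sense`, `etymology`) and the function names are hypothetical, the nine steps are collapsed into five simplified rules, and the sample data is limited to keywords quoted in this article.

```python
from typing import Callable, Optional

# Hypothetical records: these field names are illustrative assumptions,
# not the real Kanjiverse data model.
CHARS = {
    "釆": {"radical_name": "topped rice"},     # Kangxi radical, Japanese name
    "郷": {"japanese_sense": "hometown"},      # jōyō kanji
    "乡": {"variant_of": "郷"},                # variant, shares the keyword of 郷
    "哥": {"chinese_sense": "elder brother"},  # component only, Chinese usage
}

def by_radical(ch: dict, table: dict) -> Optional[str]:
    return ch.get("radical_name")

def by_variant(ch: dict, table: dict) -> Optional[str]:
    # A variant inherits whatever keyword its base character gets.
    base = ch.get("variant_of")
    return assign_keyword(base, table) if base else None

def by_japanese_sense(ch: dict, table: dict) -> Optional[str]:
    return ch.get("japanese_sense")

def by_chinese_sense(ch: dict, table: dict) -> Optional[str]:
    return ch.get("chinese_sense")

def by_etymology(ch: dict, table: dict) -> Optional[str]:
    return ch.get("etymology")

# Rules are tried in order; the first one that yields a keyword wins.
RULES: list[Callable[[dict, dict], Optional[str]]] = [
    by_radical,
    by_variant,
    by_japanese_sense,
    by_chinese_sense,
    by_etymology,
]

def assign_keyword(glyph: str, table: dict) -> Optional[str]:
    ch = table[glyph]
    for rule in RULES:
        keyword = rule(ch, table)
        if keyword is not None:
            return keyword
    return None  # no rule applied; the character needs manual review

if __name__ == "__main__":
    for glyph in CHARS:
        print(glyph, assign_keyword(glyph, CHARS))
```

Keeping the rules in an ordered list makes the priority explicit: changing which source takes precedence means reordering the list rather than rewriting nested conditions.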
Unfortunately, this system has its limits, and the following trade-offs had to be taken into consideration when applying the above methodology:
- A kanji can have multiple meanings. For the sake of conciseness, the chosen keyword only covers one of them:
高 is assigned the keyword "tall" even though, depending on the context, it can also have other meanings.
- Multiple kanji can have similar meanings. Unless they are true variants of each other, different keywords with similar meanings had to be assigned:
自 and 己 both mean "(one)self", but to distinguish them from each other, "oneself" is assigned to the former and "self" to the latter. Note that such near-synonyms often appear together in a word of the same meaning, like 自己.
- The current (Japanese) usage does not always match the etymology. Not only have the characters' shapes and meanings evolved naturally over the centuries, but Japan's postwar reform to simplify the traditional characters (旧字体 kyūjitai) has also resulted in simplifications (新字体 shinjitai) that can obscure the etymology of a character, for instance by swapping a component with a simpler one that looked (or sounded) similar but has nothing to do with its original meaning.
The traditional character for "blue" was simplified into its current form 青 by replacing its 円 component.
The traditional character for "fat" was simplified into 脂. Although the two might look exactly the same on your screen (a font issue due to Unicode Han unification), the left component was originally ⺼ "flesh" and not 月.
⺼ is actually the radical form of 肉 "meat", which is why it appears as the semantic component in all kanji related to body parts, such as 腕.
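The second trade-off (distinct keywords for distinct characters, shared keywords only for true variants) lends itself to an automated check. The sketch below assumes a flat glyph-to-keyword mapping and a glyph-to-base-character variant mapping. Both structures and the function name `find_collisions` are hypothetical, and the sample entries are keywords quoted in this article.

```python
from collections import defaultdict

def find_collisions(keywords: dict[str, str],
                    variant_of: dict[str, str]) -> dict[str, list[str]]:
    """Return keywords shared by characters that are NOT variants of each other."""
    groups: dict[str, set[str]] = defaultdict(set)
    for glyph, keyword in keywords.items():
        # Variants collapse onto their base character, so they never collide.
        canonical = variant_of.get(glyph, glyph)
        groups[keyword].add(canonical)
    return {kw: sorted(glyphs) for kw, glyphs in groups.items() if len(glyphs) > 1}

# Illustrative data only.
KEYWORDS = {"自": "oneself", "己": "self", "郷": "hometown", "乡": "hometown"}
VARIANT_OF = {"乡": "郷"}

if __name__ == "__main__":
    # 乡 and 郷 share "hometown" legitimately, so nothing is reported.
    print(find_collisions(KEYWORDS, VARIANT_OF))
    # Giving 己 the same keyword as 自 would be flagged.
    print(find_collisions({**KEYWORDS, "己": "oneself"}, VARIANT_OF))
```

A check like this only catches exact duplicates. Deciding whether "oneself" and "self" are distinct enough to be useful remains a judgment call.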
I hope this clarifies why a particular keyword was assigned to a character. Below you can access different lists of kanji and their keywords, ordered by frequency of usage. They are distributed under CC-BY-SA 4.0, so feel free to use them in your own projects by mentioning Kanjiverse as the source, with a link to the corresponding list:
- All 3997 Kanji and components
- Only the 2136 Jōyō Kanji and their components
- Only the 863 Jinmeiyō Kanji and their components that are not already in the Jōyō list
- Only the 214 Radicals, their variants and components
Please note that those keyword lists are not in their definitive version and haven't been reviewed by a native speaker yet. There might still be typos, misinterpretations, or other inaccuracies. If you spot any, please report them to me using the social links below so we can improve the lists for everyone :)