The mathematics of kana vs. kanji usage over time (1879-1968)
In the fourth comment to "Striving to revive the flagging sinographic cosmopolis" (4/26/25), I observed that the use of morphosyllabic kanji has been declining over time in favor of kana and other phonetic elements of the writing system, and I expressed the wish that a quantitative study of actual usage be carried out. It turns out that we already have this information, and it is visually evident in these graphs, which were called to my attention by Jim Unger.
They are from:
Zusetsu nihongo: Gurafu de miru kotoba no sugata kotoba o hakaru keiryō kokugogaku (Kadokawa shō jiten 9), Hayashi Ōki.
図説日本語: グラフで見ることばの姿 ことばを計る計量国語学 (角川小辞典 9), 林大.
Illustrated Japanese: The appearance of words in graphs, Quantitative Japanese linguistics to measure words (Kadokawa Small Dictionaries), Hayashi Ōki, ed. and comp. (1982), pp. 276-277.
The graphs are derived from a 1969 book by Morioka Kenji on Meiji period language. Both graphs cover the years 1879-1968.
The first graph shows the percentages of (A) words of Sinitic derivation written in kana, (B) words of Sinitic derivation written with kanji, (C) words of Japanese derivation written with kanji, and (D) words of Japanese derivation written in kana over the years 1879-1968. As you might expect, the share of words in A increases very slightly. B shrinks linearly and a bit more rapidly. The ratio C/D shrinks quite a bit, and the line dividing C from D is roughly a shallow parabola.
The second graph shows the percentage of words written with kanji out of the total. The dividing line shows a sharp decline in kanji usage.
Selected readings
- "More katakana, fewer kanji" (4/4/16)
- "A plethora of katakana?" (1/13/25)
- "Kana, not kanji, for names" (1/3/21)
- "Japanese survey on forgetting how to write kanji" (9/24/12)
Andreas Johansson said,
April 29, 2025 @ 6:42 am
Cool. Would be even cooler with comparable data for today.
(For bonus cool: words of non-East Asian derivation. I assume there's hardly any English etc. loanwords written in kanji, but the proportion of such words in kana has surely been climbing.)
Chris Button said,
April 29, 2025 @ 7:12 am
Surely there are loads of more recent studies? I imagine the 1981 and 2010 expansions of the "general/common use" kanji list came after much debate.
Victor Mair said,
April 29, 2025 @ 8:00 am
@Andreas Johansson @Chris Button
Bring 'em on!
I doubt that the overall trajectories would change much; they might even steepen, for all the reasons that were operative during the 89 years covered by the Morioka Kenji study.
Victor Mair said,
April 29, 2025 @ 8:03 am
Plus some new ones, such as revolutionary developments in digital and electronic devices.
J.W. Brewer said,
April 29, 2025 @ 11:59 am
Further to Andreas Johansson's point, it would be useful if the "kana" statistics were subdivided between the two leading brands of kana, since if the growth is all in katakana that's not inconsistent with increased usage of new non-Sinitic loanwords, whereas any increasing use of hiragana to write specific established lexemes previously written in kanji would be a significantly more remarkable/striking phenomenon.
KIRINPUTRA said,
April 29, 2025 @ 7:15 pm
"奈良" would be counted as "kanji"? But if somebody wrote it as "ナラ", that would be "non-kanji"?
W/o this distinction, all kana would just be kanji. But how is this distinction justified? Isn't ナ a non-cosmopolitan form of 奈, just as 达 is a non-cosmopolitan form of 達?
It seems subjective to say that ナ is a non-kanji b/c it is expressly for phonetic use, while 奈 is a "kanji" when used phonetically. I'm sure the split makes sense subjectively to any number of people, but is there any objective basis for it?
Philip Spaelti said,
April 30, 2025 @ 11:43 pm
@KIRINPUTRA: Your question seems to suggest that you have no idea how Japanese writing works.
The Kana syllabaries are syllabaries. The kana may have their origin in Kanji, but in the modern script they are only phonetic elements.
ナ cannot be substituted for the Kanji 奈, it can only be substituted for the phonetic syllable "na", usually only in well-prescribed contexts.
Chris Button said,
May 1, 2025 @ 4:22 pm
Presumably the counting is based on the use of at least one kanji in a word. So 見る counts as a word written with kanji (despite the る), while みる counts as a word written in kana.
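The counting rule presumed in this comment can be sketched in a few lines of Python. This is only an illustration of the "at least one kanji" assumption, not the procedure Morioka or Hayashi actually used; the function names (has_kanji, script_class, kanji_share) are hypothetical, and the kanji test is deliberately crude.

```python
def has_kanji(word: str) -> bool:
    """Return True if any character is in the basic CJK Unified Ideographs block.

    Assumption: U+4E00..U+9FFF is a good enough stand-in for "kanji".
    It misses extension blocks and the iteration mark 々 (U+3005).
    """
    return any("\u4e00" <= ch <= "\u9fff" for ch in word)


def script_class(word: str) -> str:
    """Classify a word as 'kanji' if it contains at least one kanji, else 'kana'."""
    return "kanji" if has_kanji(word) else "kana"


def kanji_share(words) -> float:
    """Fraction of words in the sample that count as written with kanji."""
    words = list(words)
    if not words:
        return 0.0
    return sum(has_kanji(w) for w in words) / len(words)


# Toy sample built from the examples in the comments above (not real corpus data).
sample = ["見る", "みる", "奈良", "ナラ"]
labels = {w: script_class(w) for w in sample}
share = kanji_share(sample)
```

Under this rule 見る is counted as a kanji word despite the る, while みる and ナラ are counted as kana words. A real study would also need a decision on how to treat mixed or borderline spellings, which this sketch does not attempt.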