Chinese characters

in your analog reality

And your digital

Claes Wallin (韋嘉誠) <acgg@clacke.greatsinodevelopment.com>

ACGG Vinterhack 2015

http://slides-greatsino.rhcloud.com/hanzikanji.html

The quick quick version
-----------------------

A long time ago they looked like this:

Today they look like this, but in between they looked like this:

"Small seal script" in Small seal script and Traditional

Some people used them like this: 之乎路可良 ... which was simplified by some people to look like this: シヲヂカラ ... but simplified by others to look like this: しをぢから
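On the digital side, the two simplified forms above sit in parallel Unicode blocks: each katakana letter is 0x60 code points above its hiragana counterpart. A minimal sketch of that mapping, assuming plain (full-width) kana; the function name `katakana_to_hiragana` is made up for illustration:

```python
# Distance between the Katakana and Hiragana blocks in Unicode (0x60).
KANA_OFFSET = ord("ア") - ord("あ")


def katakana_to_hiragana(text: str) -> str:
    """Shift katakana letters down to the matching hiragana letters."""
    out = []
    for ch in text:
        # ァ (U+30A1) through ヶ (U+30F6) have hiragana twins 0x60 lower.
        if "ァ" <= ch <= "ヶ":
            out.append(chr(ord(ch) - KANA_OFFSET))
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)


print(katakana_to_hiragana("シヲヂカラ"))
```

Characters outside that range, such as the long-vowel mark ー, pass through unchanged.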
Shift JIS
---------

Then computers came and some people thought a 本 was a [0x967B](https://en.wikipedia.org/wiki/Shift_JIS),

GB 2312
-------

others thought it was a [0x313E](https://en.wikipedia.org/wiki/GB_2312),

Big5
----

still others figured it looked like a [0xA5BB](https://en.wikipedia.org/wiki/Big5),

EUC-JP
------

or maybe a [0xCBDC](https://en.wikipedia.org/wiki/Extended_Unix_Code#EUC-JP),

ISO-2022-JP
-----------

unless you went for [0x4B5C](https://en.wikipedia.org/wiki/ISO/IEC_2022#ISO-2022-JP) (same thing really),

Unicode / ISO 10646
-------------------

but today all sane people agree it's really a [U+672C](https://en.wikipedia.org/wiki/Unicode).

UTF-8
-----

Which sometimes makes more sense to spell [0xE69CAC](https://en.wikipedia.org/wiki/UTF-8).
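The byte values above can be checked with Python's built-in codecs. This is a rough sketch, not a reference: the codec names are Python's own spellings, and the helpers `encoded_hex` and `utf8_three_bytes` are made up for illustration. Two caveats: Python's `gb2312` codec emits the EUC-CN form (0xB1BE, which is 0x313E with the high bit set on each byte), and `iso2022_jp` wraps 0x4B5C in escape sequences.

```python
CHAR = "本"

# Python's codec names for the encodings discussed above.
CODECS = ["shift_jis", "gb2312", "big5", "euc_jp", "iso2022_jp", "utf-8"]


def encoded_hex(char: str, codec: str) -> str:
    """Return the encoded bytes of char as an uppercase hex string."""
    return char.encode(codec).hex().upper()


def utf8_three_bytes(code_point: int) -> bytes:
    """Hand-encode a code point in U+0800..U+FFFF as three UTF-8 bytes."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("only the three-byte range is handled here")
    return bytes([
        0xE0 | (code_point >> 12),           # 1110xxxx: top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),   # 10xxxxxx: middle 6 bits
        0x80 | (code_point & 0x3F),          # 10xxxxxx: low 6 bits
    ])


for codec in CODECS:
    print(f"{codec:12} {encoded_hex(CHAR, codec)}")

print(f"code point   U+{ord(CHAR):04X}")
print(f"by hand      {utf8_three_bytes(ord(CHAR)).hex().upper()}")
```

The hand-rolled encoder only covers the three-byte range (and does not reject surrogates), which is enough for 本 and most other common hanzi.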
Quick history – analog
----------------------

* 1600 BCE – [Shang dynasty](http://enwp.org/Shang_dynasty) – [Oracle bone script](http://enwp.org/Oracle_bone_script)
* 500 BCE – [Spring & Autumn](http://enwp.org/Spring_and_Autumn_period), [Qin](http://enwp.org/Qin_%28state%29) – [Small seal script](http://enwp.org/Small_Seal_Script)
* 500 CE – [Southern & Northern dynasties](http://enwp.org/Southern_and_Northern_Dynasties) – **[Traditional](http://enwp.org/Traditional_Chinese_characters)**
* 650 CE – [Asuka period](http://enwp.org/Asuka_period) (Japan) – [Man'yōgana](http://enwp.org/Man'yōgana)
* 800 CE – [Heian period](http://enwp.org/Heian_period) (Japan) – [Hiragana](http://enwp.org/Hiragana), [Katakana](http://enwp.org/Katakana)
* 800 CE – Heian period (Japan) – [Kanbun](http://enwp.org/Kanbun) ([kunyomi](http://enwp.org/Kanji#Kun.27yomi_.28Japanese_reading.29))
* 1600 CE – [Written vernacular Chinese](http://enwp.org/Written_vernacular_Chinese)
* 1900 CE – **[Japanese script standardization](http://enwp.org/Japanese_script_reform#Pre-War_reforms)** (first attempt)
* 1915–20 CE – [Modern Written Chinese](http://enwp.org/Standard_Chinese)
* 1946 CE – [Tōyō kanji](http://enwp.org/Tōyō_kanji) (1850 characters)
* 1956 CE – **[Simplified](http://enwp.org/Simplified_Chinese_characters)**
* 1981 CE – **[Jōyō kanji](http://enwp.org/Jōyō_kanji)** (1945 characters)
Quick history – digital
-----------------------

* 1978 CE – [JIS C 6226](http://enwp.org/JIS_X_0208) (6802 characters)
* 1981 CE – [GB 2312](http://enwp.org/GB_2312)
* 1984 CE – [Big5](http://enwp.org/Big5)
* 1987 CE – [JIS X 0208](http://enwp.org/JIS_X_0208)
* 1991 CE – [Unicode](http://enwp.org/Unicode) 1.0 (Han unification)
* 1993 CE – [ISO/IEC 10646](http://enwp.org/Universal_Coded_Character_Set), Unicode 1.1
* 1999 CE – [HKSCS](http://enwp.org/Hong_Kong_Supplementary_Character_Set)
* 2002 CE – Unicode 3.2 (Variation selectors)
* 2004 CE – [JIS X 0213](http://enwp.org/JIS_X_0213)

500 BCE – Qin – Small seal script

"Small seal script" in Small seal script and Traditional

(Public Domain by Gsklee at en Wikipedia)

The first reduction and standardization

The first emperor of China in front of yet another "sword" hanzi

"There are 19 different ways to write it"

### Simplified

* PRC
* Singapore

### Traditional

* ROC
* Hong Kong
* Macau

### Japanese

* Yeah

### Korean

* Gone, with exceptions
* Coming back \[citation needed]

### Vietnamese

* Nah
Pictogram?
----------

* [木](http://cojak.org/index.php?function=character_lookup&term=木)
* [本](http://cojak.org/index.php?function=character_lookup&term=本)
* [人](http://cojak.org/index.php?function=character_lookup&term=人)
* [中](http://cojak.org/index.php?function=character_lookup&term=中)
* [仲](http://cojak.org/index.php?function=character_lookup&term=仲)
* [力](http://cojak.org/index.php?function=character_lookup&term=力)
* [加](http://cojak.org/index.php?function=character_lookup&term=加)
* [非](http://cojak.org/index.php?function=character_lookup&term=非)
* [咖](http://cojak.org/index.php?function=character_lookup&term=咖)[啡](http://cojak.org/index.php?function=character_lookup&term=啡)
* [嘉](http://cojak.org/index.php?function=character_lookup&term=嘉), [䕒](http://cojak.org/index.php?function=character_lookup&term=䕒)
* [吾](http://cojak.org/index.php?function=character_lookup&term=吾), [唔](http://cojak.org/index.php?function=character_lookup&term=唔)
Universal?
----------

"Is it theirs?"

是不是他們的?
--------------

Shì búshì tāmende? (Mandarin reading)

Si6 bat1si6 taa1mun4dik1? (Cantonese reading)

[係唔係佢哋嘅?](https://en.wikipedia.org/wiki/Written_Cantonese#Vocabulary)
--------------

Hai6 ng4hai6 keui5dei6ge3? (Cantonese as spoken)
Written Cantonese
-----------------

* Court/police transcripts
* Opera scripts
* SMS/chat
* Ads/restaurant menus
### Kunyomi

東京 [higashimiyako](http://enwp.org/Tokyo)

京都 [miyakosubete](http://enwp.org/Kyoto)

### Onyomi

大阪 [taihan](http://enwp.org/Osaka)
Chinese writing reform
----------------------

* [Zhuyin/Bopomofo](http://enwp.org/Bopomofo) (1910)
* Simplified (1956)
* [Hanyu Pinyin](http://enwp.org/Pinyin) (1958)

### Pro

* Education
* Cost

### Con

* Inertia
* [Lion-Eating Poet in the Stone Den](https://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_Den)

### Pro

* [Lion-Eating Poet in the Stone Den](https://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_Den)
