Language policy and planning in the PRC are presented on various official central and local websites. Among the tasks that the authorities set for linguists is the annual statistical analysis of Chinese words and characters used online. Technologies that recognize words in character text and tag them by part of speech are applied in building numerous text corpora of Chinese, an isolating, syllabic language; these corpora are often accessible on the Web. AI-based speech recognition and speech synthesis technologies are used in machine translation, on official websites for learners of Standard Mandarin, and in spoken corpora. The latter include dialect corpora created in Taiwan and Hong Kong and available online.
中国的语言政策与语言规划在各类官方的中央和地方网站上均有体现。对网络上使用的汉语字词进行年度统计分析,是有关部门向语言学家提出的任务之一。汉语属于孤立型、音节型语言,识别汉字文本中的词并为其标注词性的技术被应用于众多文本语料库的建设,这些语料库往往可在网上访问。基于人工智能的语音识别和语音合成技术被用于机器翻译、面向标准普通话学习者的官方网站以及口语语料库。后者包括在台湾和香港建立并在线开放的方言语料库。