All Projects → shangjingbo1226 → Chinesewordcloud

shangjingbo1226 / Chinesewordcloud

Licence: apache-2.0
Word Cloud for Chinese Text Corpus (中文词云制作)

Programming Languages

python
139335 projects - #7 most used programming language

ChineseWordCloud

Example Results

If we use some transcripts of movies (i.e., example.txt), we can get the following word clouds.

alt text alt text

Ideas

It creates the word cloud for Chinese text corpus. The layout and color of the word cloud is fit to the background templates (accept .png and .jpg files).

First, the input text is tokenized into different words. Second, stopwods will be filtered before we count the frequency. In the end, word clouds based on different background templates are generated. The output files are generated and named based on the templage names.

File structure

|-ChineseWordCloud
  |-create_word_cloud.py
  |-data
    |-stopwords.txt
    |-templates
      |-love.png
      |-color_love.png
  |-fonts
    |-STFangSong.ttf

Usage

Install the required packages through pip

pip install -r requirements.txt

The input file can be specified in the command line as follows.

python create_word_cloud.py example.txt

The default output will be

love_example.png
color_love_example.png

Advanced Usages

More Backgraound Templates

Please put your new pictures under ./data/templates/. Both png and jpeg files are accepted.

Different Fonts

Please modify the font_filename in the python script.

Notes

This is a part of 2017 birthday gifts to my wife :).

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].