
Naereen / generate-word-cloud.py

Licence: GPL-3.0 license
🐍 A simple Python (2 or 3) script to generate a PNG word-cloud ☁️ image from a bunch of 📂 text files 🎉. Based on word_cloud by @amueller.


generate-word-cloud.py

A simple Python 🐍 script to generate a square wordcloud ☁️ from one (or more) text file(s). It supports both Python 2 and 3 (2.7+ and 3.4+).

generate-word-cloud example meta

Based on the great word_cloud module by @amueller.



How to use it?

1. Requirements

The usual module matplotlib is needed for the plotting, docopt is needed for the command line interface, and word_cloud is needed for the actual work (generating the cloud of words after reading the files).

The required Python (2 or 3) modules can be installed with pip, either directly:

# Directly:
sudo pip install matplotlib docopt wordcloud

Or with the requirements.txt file:

sudo pip install -r requirements.txt

Note: if ansicolortags is available, it will be used to print nice colors in the help and during the generation of word clouds.
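
The optional dependency mentioned in the note can be handled with a guarded import. The following is a minimal sketch, not the script's actual code: `optional_import` and `describe_output_mode` are hypothetical names, and the sketch only detects whether ansicolortags is installed, without calling into it.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Assumption: colored output is enabled only when the package is present.
colors = optional_import("ansicolortags")


def describe_output_mode():
    """Hypothetical helper: report which output mode would be used."""
    return "colored" if colors is not None else "plain"
```

With this pattern the required modules stay mandatory, while a missing optional one only switches the output to plain text.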

2. Installation

Clone the repository, then copy the script (generate-word-cloud.py) to a directory in your PATH (e.g., ~/.local/bin/).

You can also just download the script itself:

$ wget https://raw.githubusercontent.com/Naereen/generate-word-cloud.py/master/generate-word-cloud.py
$ cp generate-word-cloud.py /path/to/a/directory/in/your/PATH/

Note: the script is also available from PyPI (pypi.python.org/pypi/generatewordcloud), so you can install it using pip:

$ pip install generatewordcloud
$ # Or maybe you need sudo rights:
$ sudo pip install generatewordcloud



3. Usage

Help:

$ generate-word-cloud.py --help

From one or two files

Generate a wordcloud from two txt files in the current directory, save it to wordcloud_txt.png.

$ generate-word-cloud.py -o ./wordcloud_txt.png ./file1.txt ./file2.txt

Generate a wordcloud from the textfile hamlet.txt (~ 8000 lines), saving to hamlet.png:

$ generate-word-cloud.py -o ./hamlet.png ./hamlet.txt

generate-word-cloud example hamlet

(It should work on pretty big text files without any issue.)
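
For readers who prefer to call the underlying library directly, here is a rough sketch of what such a run amounts to, under stated assumptions. `render_cloud` is a hypothetical wrapper around the `WordCloud` class of the word_cloud package (imported as `wordcloud`), using the defaults listed in the script's help (150 words, 400 by 300 pixels); it is not the script's own code. `top_words` is a stdlib-only stand-in that counts word frequencies, a simplification of the counting step rather than the library's algorithm.

```python
import re
from collections import Counter


def top_words(text, max_words=150):
    """Stdlib stand-in: the most frequent lowercase words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(max_words)


def render_cloud(text, outfile="wordcloud.png", max_words=150, width=400, height=300):
    """Hypothetical wrapper: render `text` as a word cloud image file."""
    # Imported lazily so that top_words() works without the package installed.
    from wordcloud import WordCloud

    cloud = WordCloud(max_words=max_words, width=width, height=height)
    cloud.generate(text)    # count the words and lay them out
    cloud.to_file(outfile)  # write the image to disk
    return outfile
```

Calling `top_words` on the text first is a cheap way to preview which words are likely to dominate the image before rendering it.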


Other examples

From a lot of Python scripts (~ 200) 🐍

generate-word-cloud example python

From a lot of Bash scripts (~ 150) 🐚

generate-word-cloud example bash

From a lot of LaTeX files (~ 180) 🍆

generate-word-cloud example LaTeX

🎨 Meta example

Generate a wordcloud from the README.md and generate-word-cloud.py files of this very project, save it to wordcloud_meta.png!

$ generate-word-cloud.py -o ./wordcloud_meta.png ./*.md ./*.py

generate-word-cloud example meta


Features

  • Supports one or more input files, and cleanly skips any file it fails to find or read.
  • Custom output file: an existing file won't be overwritten without confirmation (use the -f flag to force).
  • Nice command line interface, originally argparse-powered. I switched to docopt after realizing how awesome it is!
  • Has a command line option for every important parameter (max number of words, width, height, etc.).
  • Fixed: input filenames containing spaces (e.g. this file.txt) used to be seen as several files. The switch to docopt fixed this.
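
The first two features can be sketched as follows. This is an illustration under assumptions, not the script's implementation: `read_readable` and `may_write` are hypothetical names, and UTF-8 is assumed as the input encoding.

```python
import io
import os


def read_readable(paths, encoding="utf-8"):
    """Return (text, skipped): the combined text, plus the paths that were skipped."""
    chunks, skipped = [], []
    for path in paths:
        try:
            with io.open(path, encoding=encoding) as handle:
                chunks.append(handle.read())
        except (IOError, OSError, UnicodeDecodeError):
            skipped.append(path)  # missing, unreadable, or undecodable file
    return "\n".join(chunks), skipped


def may_write(outfile, force=False):
    """True when `outfile` does not exist yet, or when overwriting is forced."""
    return force or not os.path.exists(outfile)
```

Collecting the skipped paths instead of raising lets the caller report them once at the end and still produce a cloud from the files that were readable.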

📃 Complete documentation (--help)

$ generate-word-cloud.py --help
Usage:
  generate-word-cloud.py [-s | --show] [-f | --force] [-o OUTFILE | --outfile=OUTFILE]
                         [-t TITLE | --title=TITLE] [-m MAX | --max=MAX]
                         [-w WIDTH | --width=WIDTH] [-H HEIGHT | --height=HEIGHT]
                         INFILE...
  generate-word-cloud.py (-h | --help)
  generate-word-cloud.py (-v | --version)

Options:
  -h --help            Show this help message and exit.
  -v --version         Show program's version number and exit.
  -s --show            Show the image but do not save it [default False].
  -f --force           Force to write the image, even if present (default is to ask before overwriting an existing file) [default False].
  -o OUTFILE --outfile=OUTFILE
                       Filename for the generated image [default 'wordcloud.png'].
  -t TITLE --title=TITLE
                       Title for the image [default None].
  -m MAX --max MAX
                       Max number of words to display on the cloud word [default 150].
  -w WIDTH --width WIDTH
                       Width of the generate image [default 400].
  -H HEIGHT --height HEIGHT
                       Height of the generate image [default 300].
  INFILE               A text file to read.
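
docopt returns option values as strings (or None when an option is absent and has no parsed default), so the numeric options above need converting before use. The sketch below shows one way to do that, assuming a docopt-style dictionary keyed by the long option names; `DEFAULTS` and `numeric_options` are hypothetical names, and the default values are copied from the help text.

```python
# Defaults copied from the help text above (assumed to apply when a value is missing).
DEFAULTS = {"--max": 150, "--width": 400, "--height": 300}


def numeric_options(args):
    """Convert the numeric options of a docopt-style dict to positive integers."""
    result = {}
    for key, default in DEFAULTS.items():
        raw = args.get(key)
        value = default if raw is None else int(raw)  # int() rejects non-numbers
        if value <= 0:
            raise ValueError("{} must be a positive integer, got {!r}".format(key, raw))
        result[key.lstrip("-")] = value
    return result
```

For example, passing `{"--max": "50"}` yields a dictionary with `max` set to 50 and the width and height left at their defaults.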

📝 TODO

  • Start it, from this example,
  • Run it on some interesting examples, embed them here (as images),
  • Check on weird encodings (i.e., not UTF-8)? It works fine!
  • Test it against 📕 VERY large files (millions of lines)? It works fine, slowly but fine.
  • Test it against 📚 LOTS of files (several thousands)? It works fine, slowly but fine.
  • Publish it on PyPI: it is available at pypi.python.org/pypi/generatewordcloud/
  • Write a small article about it for my blog.

🐛 Known issues

  • Only tested on (X)Ubuntu (15.10), but it should work on other GNU/Linux distributions and Mac OS X (and probably Windows), as long as both docopt and word_cloud can be installed there.

🐛 Unknown issues?

Use the issue tracker to notify me of a bug!


About

Why write this script?

There are already many good word cloud generators online, e.g. wordle.net.

  1. I wanted a way to visualize the major keywords of Bash and Python (my two favorite programming languages) and of Markdown/Strapdown, reStructuredText and LaTeX (my favorite document typesetting systems),
  2. The original project word_cloud seemed cool. And it is. Great job @amueller 👏 !
  3. Clouds of words are interesting! And Python is awesome!

Author

Lilian Besson (Naereen).

📜 License

This script is published under the terms of the GPLv3 License (file LICENSE), © Lilian Besson, 2016.
