
davidmogar / Cucco

Licence: MIT
Text normalization library for Python


Projects that are alternatives of or similar to Cucco

React Native See More Inline
Show a "read more", "see more", "read less", "see less" inline with your text in React Native
Stars: ✭ 141 (-23.78%)
Mutual labels:  text
Image
PHP Image Manipulation
Stars: ✭ 12,298 (+6547.57%)
Mutual labels:  manipulation
React Native Image Marker
Add text or icon watermark to your images
Stars: ✭ 170 (-8.11%)
Mutual labels:  text
Parjs
JavaScript parser-combinator library
Stars: ✭ 145 (-21.62%)
Mutual labels:  text
East icpr
Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE
Stars: ✭ 154 (-16.76%)
Mutual labels:  text
Xlnet Gen
XLNet for generating language.
Stars: ✭ 164 (-11.35%)
Mutual labels:  text
Baffle
A tiny javascript library for obfuscating and revealing text in DOM elements. 😲
Stars: ✭ 1,721 (+830.27%)
Mutual labels:  text
Spannabletextview
SpannableTextView is a custom TextView which lets you customize the styling of a slice of your text or statement via Spannables, but without the hassle of having to deal directly with Spannables themselves.
Stars: ✭ 177 (-4.32%)
Mutual labels:  text
Dagon
Advanced Hash Manipulation
Stars: ✭ 155 (-16.22%)
Mutual labels:  manipulation
Git History
Quickly browse the history of a file from any git repository
Stars: ✭ 12,676 (+6751.89%)
Mutual labels:  text
Handwritten.js
Convert typed text to realistic handwriting!
Stars: ✭ 1,806 (+876.22%)
Mutual labels:  text
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+949.73%)
Mutual labels:  text
Pangu.py
Paranoid text spacing in Python
Stars: ✭ 163 (-11.89%)
Mutual labels:  text
Articulations Robot Demo
Stars: ✭ 145 (-21.62%)
Mutual labels:  manipulation
Text Detector
Tool which allows you to detect and translate text.
Stars: ✭ 173 (-6.49%)
Mutual labels:  text
Assignment writer
So your teacher told you to upload written assignments? Hate writing assignments? This tool will help you convert your text to handwriting ;-; https://saiteja69.github.io/Assignment_Writer/
Stars: ✭ 143 (-22.7%)
Mutual labels:  text
Wgpu glyph
A fast text renderer for wgpu (https://github.com/gfx-rs/wgpu)
Stars: ✭ 159 (-14.05%)
Mutual labels:  text
Typogenic
Signed-distance field text rendering for Unity
Stars: ✭ 183 (-1.08%)
Mutual labels:  text
Datash
Send and Receive files directly from your browser with end-to-end encryption
Stars: ✭ 178 (-3.78%)
Mutual labels:  text
Textwrap
An efficient and powerful Rust library for word wrapping text.
Stars: ✭ 164 (-11.35%)
Mutual labels:  text

=================================================
cucco |Build Status| |codecov| |patreon| |gitter|
=================================================

Is that... is that a cucco? Sure it is!
---------------------------------------

Cucco is here to help you to normalize those nasty texts. Removing extra white spaces is not that hard, right? What about stop words? They're no good... oh, and don't even mention emojis!

This little friend will do the hard work for you. Just set it up and let it peck all over your text.

Oh please, shut up and show me where I can grab a cucco!
--------------------------------------------------------

The easiest way to get a cucco is by using pip:

::

    $ pip install cucco

But sometimes... sometimes you want to go wild and get the biggest... No, the best!... No, THE MIGHTIEST cucco!

To do so, you may use Git. Clone the repository from GitHub and do it all the hard way:

::

    $ git clone https://github.com/davidmogar/cucco.git
    $ cd cucco
    $ python setup.py install

Got it. How do I use it?
------------------------

Now that you have a cucco, I'll let it give you all the details.

    Cucuco, cuco cuco cucucuco, CUCCO!

    -- Cucco

So true... so true... [tears falling down my face]. Just allow me to add some insight.

There are two ways of using cucco. The first one is through its CLI. You can get more info on this by running the following command:

::

    $ cucco --help

The second one is from your own code. The following example shows how to normalize a short text with cucco:

.. code:: python

    from cucco import Cucco

    cucco = Cucco()
    # With no explicit list, every normalization is applied.
    print(cucco.normalize('Who let the cucco out?'))

This applies all normalizations to the text ``Who let the cucco out?``. The output of these normalizations is:

::

    cucco
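
To see why only ``cucco`` survives, here is a minimal standalone sketch that does not use the library at all. It assumes that the full set of normalizations includes punctuation removal, stop-word removal, and whitespace collapsing. The ``sketch_normalize`` function and the tiny ``STOP_WORDS`` set are hypothetical names made up for this illustration; they are not cucco's actual implementation or data.

```python
import re

# Hypothetical, tiny stop-word set for illustration only; the real
# library ships its own per-language lists.
STOP_WORDS = {"who", "let", "the", "out"}


def sketch_normalize(text, stop_words=STOP_WORDS):
    """Strip punctuation, drop stop words, and collapse whitespace."""
    # Replace anything that is neither a word character nor whitespace.
    without_punctuation = re.sub(r"[^\w\s]", " ", text)
    # Keep only the tokens that are not stop words (case-insensitive).
    kept = [
        token
        for token in without_punctuation.split()
        if token.lower() not in stop_words
    ]
    # Joining on a single space also removes extra white spaces.
    return " ".join(kept)


result = sketch_normalize("Who let the cucco out?")
print(result)
```

Every word except ``cucco`` is in the stop-word set, and the question mark is treated as punctuation, so a single token is left.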

It's also possible to send a list of normalizations to apply, which will be executed in order.

.. code:: python

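    from cucco import Cucco

    cucco = Cucco()

    # Each entry is a normalization name, or a (name, kwargs) tuple
    # when the normalization takes arguments. Entries run in order.
    normalizations = [
        'remove_extra_white_spaces',
        ('replace_punctuation', {'replacement': ' '})
    ]

    print(cucco.normalize('Who    let   the cucco out?', normalizations))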

This is the output:

::

    Who let the cucco out
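
Because the list is executed in order, the order you choose can change the result. The following standalone sketch shows one way such an ordered list of names and ``(name, kwargs)`` tuples could be dispatched. It is an illustration under assumptions, not cucco's code: ``REGISTRY``, ``apply_normalizations``, and the two simplified functions are hypothetical stand-ins that only borrow the names used above.

```python
import re


def remove_extra_white_spaces(text):
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


def replace_punctuation(text, replacement=""):
    # Swap every character that is neither a word character nor
    # whitespace for the replacement string.
    return re.sub(r"[^\w\s]", replacement, text)


# Hypothetical registry mapping normalization names to functions.
REGISTRY = {
    "remove_extra_white_spaces": remove_extra_white_spaces,
    "replace_punctuation": replace_punctuation,
}


def apply_normalizations(text, normalizations):
    """Apply each normalization in the order given."""
    for item in normalizations:
        # An entry is either a bare name or a (name, kwargs) tuple.
        if isinstance(item, str):
            name, kwargs = item, {}
        else:
            name, kwargs = item
        text = REGISTRY[name](text, **kwargs)
    return text


punctuation = ("replace_punctuation", {"replacement": " "})

# Collapsing first leaves the spaces introduced by the replacement.
first = apply_normalizations(
    "cucco ,  cucco", ["remove_extra_white_spaces", punctuation]
)
# Collapsing last cleans them up.
second = apply_normalizations(
    "cucco ,  cucco", [punctuation, "remove_extra_white_spaces"]
)
print(repr(first))
print(repr(second))
```

In this sketch, replacing punctuation after collapsing white space leaves three spaces between the two words, while doing it the other way round leaves one.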

For more information on how to use cucco you can check `its website <http://cucco.io>`_, which will be ready cucco-soon.

Supported languages
-------------------

You never know when a cucco will learn a new trick. Currently, they can remove stop words for 50 languages. The complete list can be checked `here <https://github.com/davidmogar/cucco/tree/master/cucco/data>`_. If you are looking for the source, you can find it in `this GitHub repository <https://github.com/6/stopwords-json>`_, which uses JSON for the stop words files.
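
If you want to play with a JSON stop-word list yourself, a rough sketch could look like the one below. It assumes that a list is a plain JSON array of words, which is an assumption about the file layout rather than something stated here. The inlined ``SAMPLE_JSON`` string and the helper names ``load_stop_words`` and ``drop_stop_words`` are hypothetical and are not part of cucco.

```python
import json

# Assumed layout: one JSON array of words per language. This sample
# is inlined so the sketch runs without any files.
SAMPLE_JSON = '["The", "a", "of", "the"]'


def load_stop_words(raw_json):
    """Parse a JSON array into a lowercase, duplicate-free set."""
    return frozenset(word.lower() for word in json.loads(raw_json))


def drop_stop_words(text, stop_words):
    """Remove whitespace-separated tokens found in stop_words."""
    return " ".join(
        token for token in text.split() if token.lower() not in stop_words
    )


stop_words = load_stop_words(SAMPLE_JSON)
cleaned = drop_stop_words("The song of a cucco", stop_words)
print(cleaned)
```

Lowercasing both the list and the tokens keeps the comparison case-insensitive, and the set makes each lookup cheap.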

Can I contribute?
-----------------

Are you a breeder? No? Don't worry, you can still help.

Maybe you have a good new feature to add. Maybe it's not even good. It doesn't matter! It is always good to share ideas, isn't it? Just go for it! Pull requests are warmly welcomed.

Not in the mood to implement it yourself? You can still create an issue and comment about it there. Feedback is always great!

.. |Build Status| image:: https://travis-ci.org/davidmogar/cucco.svg?branch=master
   :target: https://travis-ci.org/davidmogar/cucco
.. |codecov| image:: https://codecov.io/gh/davidmogar/cucco/branch/master/graph/badge.svg
   :target: https://codecov.io/gh/davidmogar/cucco
.. |patreon| image:: https://img.shields.io/badge/support%20on-patreon-red.svg
   :target: https://www.patreon.com/davidmogar
.. |gitter| image:: https://img.shields.io/gitter/room/nwjs/nw.js.svg
   :target: https://gitter.im/davidmogar/cucco
