clips / Pattern

Licence: bsd-3-clause
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern

Pattern is a web mining module for Python. It has tools for:

  • Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM parser
  • Natural Language Processing: part-of-speech taggers, n-gram search, sentiment analysis, WordNet
  • Machine Learning: vector space model, clustering, classification (KNN, SVM, Perceptron)
  • Network Analysis: graph centrality and visualization.

It is well documented, thoroughly tested with 350+ unit tests and comes bundled with 50+ examples. The source code is licensed under BSD.

Example

This example trains a classifier on adjectives mined from Twitter, using Python 3. First, tweets that contain the hashtag #win or #fail are collected, for example: "$20 tip off a sweet little old lady today #win". The words in each tweet are then tagged by part of speech, and only the adjectives are kept. Each tweet is transformed into a vector, a dictionary of adjective → count items, labeled WIN or FAIL. The classifier uses these vectors to learn which other tweets look more like WIN or more like FAIL.

from pattern.web import Twitter
from pattern.en import tag
from pattern.vector import KNN, count

twitter, knn = Twitter(), KNN()

for i in range(1, 3):
    for tweet in twitter.search('#win OR #fail', start=i, count=100):
        s = tweet.text.lower()
        p = 'WIN' if '#win' in s else 'FAIL' # label taken from the hashtag
        v = tag(s)
        v = [word for word, pos in v if pos == 'JJ'] # JJ = adjective
        v = count(v) # {'sweet': 1}
        if v:
            knn.train(v, type=p)

print(knn.classify('sweet potato burger'))
print(knn.classify('stupid autocorrect'))
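
To make the classification step more concrete without calling Twitter, here is a minimal standard-library sketch of a nearest-neighbour classifier over adjective → count vectors. It is not Pattern's implementation: `TinyKNN`, `cosine_similarity`, the choice of cosine similarity, and the hand-written training vectors are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


class TinyKNN:
    """Hypothetical stand-in for a k-nearest-neighbour classifier."""

    def __init__(self, k=1):
        self.k = k
        self.examples = []  # list of (vector, label) pairs

    def train(self, vector, label):
        self.examples.append((dict(vector), label))

    def classify(self, vector):
        if not self.examples:
            return None
        # Rank training examples from most to least similar.
        ranked = sorted(
            self.examples,
            key=lambda example: cosine_similarity(vector, example[0]),
            reverse=True,
        )
        # Majority vote among the k most similar examples.
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Hand-written vectors standing in for adjectives mined from tweets.
knn = TinyKNN(k=1)
knn.train({'sweet': 1, 'little': 1, 'old': 1}, 'WIN')
knn.train({'stupid': 1, 'broken': 1}, 'FAIL')

print(knn.classify({'sweet': 1}))   # WIN
print(knn.classify({'stupid': 1}))  # FAIL
```

With k=1 the label of the single most similar training vector wins; a larger k smooths out noisy examples at the cost of needing more training data.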

Installation

Pattern supports Python 2.7 and Python 3.6. To install Pattern so that it is available in all your scripts, unzip the download and run the following from the command line:

cd pattern-3.6
python setup.py install

If you have pip, you can automatically download and install from the PyPI repository:

pip install pattern
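
After either installation method, one way to confirm that Python can find the module is to ask the import system for it. The following is a small sketch using only the standard library; the helper name `is_importable` is hypothetical, and it only handles top-level module names such as `pattern`.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate a top-level module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the install has succeeded, False otherwise.
print("pattern importable:", is_importable("pattern"))
```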

If none of the above works, you can make Python aware of the module in three ways:

  • Put the pattern folder in the same folder as your script.
  • Put the pattern folder in the standard location for modules so it is available to all scripts:
    • c:\python36\Lib\site-packages\ (Windows),
    • /Library/Python/3.6/site-packages/ (Mac OS X),
    • /usr/lib/python3.6/site-packages/ (Unix).
  • Add the location of the module to sys.path in your script, before importing it:
MODULE = '/users/tom/desktop/pattern'
import sys
if MODULE not in sys.path:
    sys.path.append(MODULE)
from pattern.en import parsetree
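
If several scripts need the same sys.path adjustment, it can be wrapped in a small helper. This is a sketch only: `add_module_dir` is a hypothetical name, and the path is the illustrative one used above, not a location that needs to exist on your machine.

```python
import os
import sys


def add_module_dir(path):
    """Append a directory to sys.path once; return True if it was added."""
    path = os.path.abspath(path)
    if path in sys.path:
        return False  # already present, nothing to do
    sys.path.append(path)
    return True


# Illustrative location; replace it with wherever the pattern folder lives.
MODULE = '/users/tom/desktop/pattern'
add_module_dir(MODULE)
```

Appending rather than inserting at the front means an installed copy of the module, if there is one, still takes precedence over the manually added folder.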

Documentation

For documentation and examples see the user documentation.

Version

3.6

License

BSD, see LICENSE.txt for further details.

Reference

De Smedt, T., Daelemans, W. (2012). Pattern for Python. Journal of Machine Learning Research, 13, 2031–2035.

Contribute

The source code is hosted on GitHub, and contributions and donations are welcome.

Bundled dependencies

Pattern is bundled with the following data sets, algorithms and Python packages:

  • Brill tagger, Eric Brill
  • Brill tagger for Dutch, Jeroen Geertzen
  • Brill tagger for German, Gerold Schneider & Martin Volk
  • Brill tagger for Spanish, trained on Wikicorpus (Samuel Reese & Gemma Boleda et al.)
  • Brill tagger for French, trained on Lefff (Benoît Sagot & Lionel Clément et al.)
  • Brill tagger for Italian, mined from Wiktionary
  • English pluralization, Damian Conway
  • Spanish verb inflection, Fred Jehle
  • French verb inflection, Bob Salita
  • Graph JavaScript framework, Aslak Hellesoy & Dave Hoover
  • LIBSVM, Chih-Chung Chang & Chih-Jen Lin
  • LIBLINEAR, Rong-En Fan et al.
  • NetworkX centrality, Aric Hagberg, Dan Schult & Pieter Swart
  • spelling corrector, Peter Norvig

Acknowledgements

Authors:

Contributors (chronological):

  • Frederik De Bleser
  • Jason Wiener
  • Daniel Friesen
  • Jeroen Geertzen
  • Thomas Crombez
  • Ken Williams
  • Peteris Erins
  • Rajesh Nair
  • F. De Smedt
  • Radim Řehůřek
  • Tom Loredo
  • John DeBovis
  • Thomas Sileo
  • Gerold Schneider
  • Martin Volk
  • Samuel Joseph
  • Shubhanshu Mishra
  • Robert Elwell
  • Fred Jehle
  • Antoine Mazières + fabelier.org
  • Rémi de Zoeten + closealert.nl
  • Kenneth Koch
  • Jens Grivolla
  • Fabio Marfia
  • Steven Loria
  • Colin Molter + tevizz.com
  • Peter Bull
  • Maurizio Sambati
  • Dan Fu
  • Salvatore Di Dio
  • Vincent Van Asch
  • Frederik Elwert