
TiesdeKok / Python_nlp_tutorial

This repository provides everything you need to get started with Python for Text Mining / Natural Language Processing (NLP).

Projects that are alternatives of or similar to Python nlp tutorial

Practical Machine Learning With Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Stars: ✭ 1,868 (+2494.44%)
Mutual labels:  jupyter-notebook, natural-language-processing, spacy, nltk
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+1472.22%)
Mutual labels:  jupyter-notebook, natural-language-processing, spacy, nltk
Nlp profiler
A simple NLP library allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Stars: ✭ 181 (+151.39%)
Mutual labels:  jupyter-notebook, natural-language-processing, text-mining
Python Tutorial Notebooks
Python tutorials as Jupyter Notebooks for NLP, ML, AI
Stars: ✭ 52 (-27.78%)
Mutual labels:  jupyter-notebook, natural-language-processing, nltk
Pytextrank
Python implementation of TextRank for phrase extraction and summarization of text documents
Stars: ✭ 1,675 (+2226.39%)
Mutual labels:  jupyter-notebook, natural-language-processing, spacy
Nlpython
This repository contains the code related to Natural Language Processing using python scripting language. All the codes are related to my book entitled "Python Natural Language Processing"
Stars: ✭ 265 (+268.06%)
Mutual labels:  jupyter-notebook, natural-language-processing, text-mining
Nlp Python Deep Learning
NLP in Python with Deep Learning
Stars: ✭ 374 (+419.44%)
Mutual labels:  jupyter-notebook, natural-language-processing, spacy
Nlp Notebooks
A collection of notebooks for Natural Language Processing from NLP Town
Stars: ✭ 513 (+612.5%)
Mutual labels:  jupyter-notebook, natural-language-processing, text-mining
Nlp In Practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Stars: ✭ 790 (+997.22%)
Mutual labels:  jupyter-notebook, natural-language-processing, text-mining
Sense2vec
🦆 Contextually-keyed word vectors
Stars: ✭ 1,184 (+1544.44%)
Mutual labels:  natural-language-processing, spacy
Nlp Various Tutorials
A repository of various tutorials related to natural language processing
Stars: ✭ 52 (-27.78%)
Mutual labels:  jupyter-notebook, natural-language-processing
Fasttext multilingual
Multilingual word vectors in 78 languages
Stars: ✭ 1,067 (+1381.94%)
Mutual labels:  jupyter-notebook, natural-language-processing
Spark Nkp
Natural Korean Processor for Apache Spark
Stars: ✭ 50 (-30.56%)
Mutual labels:  natural-language-processing, text-mining
Spacy Lookups Data
📂 Additional lookup tables and data resources for spaCy
Stars: ✭ 48 (-33.33%)
Mutual labels:  natural-language-processing, spacy
Stocksight
Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
Stars: ✭ 1,037 (+1340.28%)
Mutual labels:  natural-language-processing, nltk
Emotion Detector
A python code to detect emotions from text
Stars: ✭ 54 (-25%)
Mutual labels:  jupyter-notebook, natural-language-processing
Whitehat
Information about my experiences on ethical hacking 💀
Stars: ✭ 54 (-25%)
Mutual labels:  jupyter-notebook, research
Nagisa Tutorial Pycon2019
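Code for PyCon JP 2019 talk "Python による日本語自然言語処理 〜系列ラベリングによる実世界テキスト分析〜" ("Japanese Natural Language Processing with Python: Real-World Text Analysis Using Sequence Labeling")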
Stars: ✭ 46 (-36.11%)
Mutual labels:  jupyter-notebook, natural-language-processing
Nltk Book Resource
Notes and solutions to complement the official NLTK book
Stars: ✭ 54 (-25%)
Mutual labels:  natural-language-processing, nltk
Vietnamese Electra
Electra pre-trained model using Vietnamese corpus
Stars: ✭ 55 (-23.61%)
Mutual labels:  jupyter-notebook, natural-language-processing

Get started with Python for Text Mining (NLP)

Want to learn how to use Python for Text Mining / Natural Language Processing (NLP)?
This repository has everything that you need to get started!

Author: Ties de Kok (Personal Page)

These materials accompany a PhD session on NLP for Accounting Research: slides

Quick link to the notebook: open notebook

Introduction

The goal of this GitHub page is to provide you with everything you need to get started with Python and Natural Language Processing (NLP).

The following topics are discussed:

(Note: the neural network part is only a reference to the Stanford course CS224n)

Who is this repository for?

The topics and techniques demonstrated in this repository are primarily oriented towards empirical research projects in fields such as Accounting, Finance, Marketing, Political Science, and other Social Sciences.

However, many of the basics are also perfectly applicable if you are looking to use Python for any other type of Data Science!

How to use this repository?

This repository is written to facilitate learning by doing.

All the material is written up in a Jupyter Notebook. See: NLP_notebook.ipynb.
The topics are split up by task description.

It is best to view the notebook locally or on nbviewer using this link: click here

An environment.yml file is provided that you can install using conda; this automatically installs all the packages used in the notebook.

Instructions on how to install the environment are provided here: Install environment

Not yet familiar with the basic Python syntax?

Please check out my "Getting started with Python for Research" repository: click here

Using Jupyter

To run the provided notebook file you need to use Jupyter Lab or Jupyter Notebook.

Jupyter comes pre-installed with the Anaconda distribution so you should have everything already installed and ready to go. The environment.yml will also install Jupyter Lab if you prefer to use that.

What is the Jupyter Notebook?

From the Jupyter website:

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.

In other words, the Jupyter Notebook allows you to program Python code straight from your browser!
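
To make the idea of a "document that contains live code" concrete: a notebook file (.ipynb) is stored as JSON, with a list of cells that are each either markdown or code. The sketch below is only an illustration; the tiny notebook in it is made up and is not taken from NLP_notebook.ipynb, and the helper names are hypothetical.

```python
import json
from collections import Counter

# A minimal, made-up notebook in the nbformat 4 layout (illustration only).
minimal_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Tokenization"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["text = 'Hello world'\n", "text.split()"],
        },
    ],
}


def count_cell_types(notebook):
    """Return how many cells of each type a notebook dictionary contains."""
    return Counter(cell["cell_type"] for cell in notebook["cells"])


def load_notebook(path):
    """Read an .ipynb file from disk as a plain dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


print(dict(count_cell_types(minimal_notebook)))
```

Once you have the repository on disk, count_cell_types(load_notebook("NLP_notebook.ipynb")) should give a rough idea of how much of the notebook is code and how much is explanation.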

How does the Jupyter Notebook work in the background?

The diagram below sums up the basic components of Jupyter:

At the heart there is the Jupyter Server that handles everything, the Jupyter Notebook which is accessed and used through your browser, and the kernel that executes the code. We will be focusing on the natively included Python Kernel but Jupyter is language agnostic so you can also use it with other languages/software such as 'R'.
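
If you are curious which kernels your own installation offers, the sketch below parses the output of `jupyter kernelspec list --json`. Treat it as a rough illustration: the helper names are hypothetical, the sample JSON is invented (including the path and the R kernel), and the assumed layout is a top-level "kernelspecs" mapping with a "spec" entry per kernel.

```python
import json
import subprocess

# Invented sample of what 'jupyter kernelspec list --json' may print (assumed layout).
sample_output = json.dumps({
    "kernelspecs": {
        "python3": {
            "resource_dir": "/hypothetical/path/kernels/python3",
            "spec": {"display_name": "Python 3", "language": "python"},
        },
        "ir": {
            "resource_dir": "/hypothetical/path/kernels/ir",
            "spec": {"display_name": "R", "language": "R"},
        },
    }
})


def kernel_languages(kernelspec_json):
    """Map each kernel name to the language it executes."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("language", "unknown")
        for name, info in specs.items()
    }


def installed_kernels():
    """Ask the local Jupyter installation for its kernels (requires jupyter on PATH)."""
    completed = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return kernel_languages(completed.stdout)


print(kernel_languages(sample_output))
```

Calling installed_kernels() inside the activated environment should list at least a Python kernel.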

It is worth noting that in most cases you will be running the Jupyter Server on your own computer and will connect to it locally in your browser (i.e. you don't need to be connected to the internet). However, it is also possible to run the Jupyter Server on a different computer, for example a high performance computation server in the cloud, and connect to it over the internet.

How to start a Jupyter Notebook?

The primary method that I would recommend to start a Jupyter Notebook is to use the command line (terminal) directly:

  1. Open your command prompt / terminal (on Windows I recommend the Anaconda Prompt)
  2. Activate the environment: conda activate PythonNLPTutorial
  3. cd (i.e. change directory) to the desired starting directory
    for example: cd "C:\Files\Work\Project_1"
    Note: if you are changing to a folder on another drive, you might also have to switch drives by typing, for example, E:
  4. Start the Jupyter Notebook server by typing: jupyter notebook or jupyter lab

This should automatically open up the corresponding Jupyter Notebook/Lab in your default browser. You can also manually go to the Jupyter Notebook/Lab by going to localhost:8888 with your browser.
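
The steps above can also be scripted. The following is a minimal sketch, not part of the repository: the function names are hypothetical, it assumes the environment is already activated so that `jupyter` is on your PATH, and it assumes the server uses port 8888 (Jupyter picks another port if that one is taken).

```python
import shutil
import socket
import subprocess


def jupyter_command(interface="lab"):
    """Build the command from step 4: 'jupyter notebook' or 'jupyter lab'."""
    if interface not in ("notebook", "lab"):
        raise ValueError("interface must be 'notebook' or 'lab'")
    return ["jupyter", interface]


def start_jupyter(start_dir, interface="lab"):
    """Start the server with start_dir as working directory (replaces the cd step)."""
    if shutil.which("jupyter") is None:
        raise RuntimeError("jupyter not found; is the environment activated?")
    # Popen returns immediately, so the server keeps running in the background.
    return subprocess.Popen(jupyter_command(interface), cwd=start_dir)


def server_is_listening(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, start_jupyter(r"C:\Files\Work\Project_1") followed a few seconds later by server_is_listening() tells you whether the page at localhost:8888 should be reachable.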

How to close a Jupyter Notebook/Lab server?

If you want to close down the Jupyter Server: open up the command prompt window that runs the server and press CTRL + C twice.
Make sure that you have saved any open Jupyter Notebooks!

How to use the Jupyter Notebook?

Some shortcuts are worth mentioning for reference purposes:

command mode --> enable by pressing esc
edit mode --> enable by pressing enter

command mode               edit mode                             both modes
Y : cell to code           Tab : code completion or indent       Shift-Enter : run cell, select below
M : cell to markdown       Shift-Tab : tooltip                   Ctrl-Enter : run cell
A : insert cell above      Ctrl-A : select all
B : insert cell below      Ctrl-Z : undo
X : cut selected cell

Code along!

Option 1: clone repository

You can get a local copy of this repository by cloning it or, more simply, by downloading it as a ZIP file.

To download the ZIP, click the "Clone or download" button and then "Download ZIP":

If you extract the downloaded ZIP to a folder you can start the Jupyter Notebook/Lab in that folder and access the notebook.
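
If you prefer to do the extraction from Python, a sketch along these lines should work. The function name and the paths in the usage note are hypothetical; use whatever name your browser gave the downloaded file.

```python
import zipfile
from pathlib import Path


def extract_repository(zip_path, target_dir):
    """Extract the downloaded ZIP and return the notebooks found inside it."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Search recursively, since a ZIP often wraps the files in a top-level folder.
    return sorted(target.rglob("*.ipynb"))
```

For example, extract_repository("Downloads/repository.zip", "Project_1") returns the paths of the extracted notebook files, so you know which folder to start Jupyter in.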

Environment

You can install the environment by following these steps:

  1. Make sure you have Anaconda installed (link)
  2. Open your command prompt / terminal (on Windows I recommend the Anaconda Prompt)
  3. cd (i.e. change directory) to the folder where you extracted the ZIP file
    for example: cd "C:\Files\Work\Project_1"
    Note: if you are changing to a folder on another drive, you might also have to switch drives by typing, for example, E:
  4. Run the following command: conda env create -f environment.yml
  5. Activate the environment with: conda activate PythonNLPTutorial

A full list of all the packages used is provided in the environment.yml file.
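
To check that the environment was actually created, you can look for its name in the output of `conda env list --json`. The sketch below is an illustration under assumptions: the helper names are hypothetical, the paths in the sample are invented, and on Windows you may need to run it from the Anaconda Prompt so that `conda` can be found.

```python
import json
import subprocess
from pathlib import PureWindowsPath

# Invented sample of the JSON printed by 'conda env list --json'.
sample_output = json.dumps({
    "envs": [
        "/home/user/anaconda3",
        "/home/user/anaconda3/envs/PythonNLPTutorial",
    ]
})


def environment_names(conda_json):
    """Return the last folder name of every environment path in the JSON text."""
    paths = json.loads(conda_json).get("envs", [])
    # PureWindowsPath splits on both '/' and '\\', so it handles either path style.
    return [PureWindowsPath(path).name for path in paths]


def conda_environment_exists(name="PythonNLPTutorial"):
    """Ask conda for its environments and check whether 'name' is among them."""
    completed = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return name in environment_names(completed.stdout)


print("PythonNLPTutorial" in environment_names(sample_output))
```

Note that the base environment shows up under the name of its installation folder (here "anaconda3"), not as "base".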

Option 2: use Binder

Binder

Note: some functionality might not work on Binder.

Questions?

If you have questions or experience problems please use the issues tab of this repository.

License

MIT - Ties de Kok - 2020

Special Thanks

https://github.com/teles/array-mixer for having an awesome readme that I used as a template.
