ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and can crawl a broad range of websites. On these websites, ARGUS can perform tasks such as scraping text or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

Stars: ✭ 68 (-76.06%)

Mutual labels: scraping

Tacred Relation

PyTorch implementation of the position-aware attention model for relation extraction

Stars: ✭ 271 (-4.58%)

Mutual labels: natural-language-processing

api-flight.com

Main API Flight Git Repository

Stars: ✭ 26 (-90.85%)

Mutual labels: scraping

Bluebert

BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).

Stars: ✭ 273 (-3.87%)

Mutual labels: natural-language-processing

scrapy-zyte-smartproxy

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

Stars: ✭ 317 (+11.62%)

Mutual labels: scraping

Bist Parser

Graph-based and Transition-based dependency parsers based on BiLSTMs

Stars: ✭ 257 (-9.51%)

Mutual labels: natural-language-processing

dmi-instascraper

A GUI for Instaloader to scrape users and hashtags on Instagram

Stars: ✭ 21 (-92.61%)

Mutual labels: scraping

facebook-discussion-tk

A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.

Stars: ✭ 33 (-88.38%)

Mutual labels: scraping

Chatbot ner

chatbot_ner: Named Entity Recognition for chatbots.

Stars: ✭ 273 (-3.87%)

Mutual labels: natural-language-processing

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (-92.25%)

Mutual labels: scraping

Adaptnlp

An easy-to-use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.

Stars: ✭ 278 (-2.11%)

Mutual labels: natural-language-processing

python-overwatch

A simple API for scraping Overwatch stats

Stars: ✭ 14 (-95.07%)

Mutual labels: scraping

Awesome Ai Awesomeness

A curated list of awesome awesomeness about artificial intelligence

Stars: ✭ 268 (-5.63%)

Mutual labels: natural-language-processing

Babler

Data Collection System For NLP/Speech Recognition

Stars: ✭ 21 (-92.61%)

Mutual labels: scraping

Languagecrunch

LanguageCrunch NLP server docker image

Stars: ✭ 281 (-1.06%)

Mutual labels: natural-language-processing

whatsapp-tracking

Scraping the status of WhatsApp contacts

Stars: ✭ 49 (-82.75%)

Mutual labels: scraping

Matterport3dsimulator

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

Stars: ✭ 260 (-8.45%)

Mutual labels: natural-language-processing

pomp

Screen scraping and web crawling framework

Stars: ✭ 61 (-78.52%)

Mutual labels: scraping

Nlp tasks

Natural Language Processing Tasks and References

Stars: ✭ 2,968 (+945.07%)

Mutual labels: natural-language-processing

chirps

Twitter bot powering @arichduvet

Stars: ✭ 35 (-87.68%)

Mutual labels: scraping

Fakenewscorpus

A dataset of millions of news articles scraped from a curated list of data sources.

Stars: ✭ 255 (-10.21%)

Mutual labels: natural-language-processing

Scraper-Projects

🕸 List of mini projects that involve web scraping 🕸

Stars: ✭ 25 (-91.2%)

Mutual labels: scraping

instagram explorer

📷 An app to scrape Instagram posts and analyze data.

Stars: ✭ 17 (-94.01%)

Mutual labels: scraping

Nlp Tutorial

Tutorial: Natural Language Processing in Python

Stars: ✭ 274 (-3.52%)

Mutual labels: natural-language-processing

jazz

The Scripting Engine that Combines Speed, Safety, and Simplicity

Stars: ✭ 132 (-53.52%)

Mutual labels: scraping

Awesome Distributed Deep Learning

A curated list of awesome Distributed Deep Learning resources.

Stars: ✭ 277 (-2.46%)

Mutual labels: natural-language-processing

bots-zoo

No description or website provided.

Stars: ✭ 59 (-79.23%)

Mutual labels: scraping

Olivia

💁‍♀️Your new best friend powered by an artificial neural network

Stars: ✭ 3,114 (+996.48%)

Mutual labels: natural-language-processing

scraper

Nodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.

Stars: ✭ 37 (-86.97%)

Mutual labels: scraping

Scrapy Crawlera

Crawlera middleware for Scrapy

Stars: ✭ 281 (-1.06%)

Mutual labels: scraping

memes-api

API for scraping common meme sites

Stars: ✭ 17 (-94.01%)

Mutual labels: scraping

Awesomefakenews

This repository contains recent research on fake news.

Stars: ✭ 270 (-4.93%)

Mutual labels: natural-language-processing

webdext

Intelligent Web Data Extractor

Stars: ✭ 75 (-73.59%)

Mutual labels: scraping

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (-2.46%)

Mutual labels: scraping

PyLex

Perform lexical analysis on words, one word at a time.

Stars: ✭ 60 (-78.87%)

Mutual labels: scraping

Nlpython

This repository contains code for Natural Language Processing in Python. All of the code relates to my book, "Python Natural Language Processing".

Stars: ✭ 265 (-6.69%)

Mutual labels: natural-language-processing

Zeiver

A Scraper, Downloader, & Recorder for static open directories.

Stars: ✭ 14 (-95.07%)

Mutual labels: scraping

Link Grammar

The CMU Link Grammar natural language parser

Stars: ✭ 286 (+0.7%)

Mutual labels: natural-language-processing

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-94.72%)

Mutual labels: scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+1010.56%)

Mutual labels: scraping

humanparser

Parse a human name string into salutation, first name, middle name, last name, suffix.

Stars: ✭ 78 (-72.54%)

Mutual labels: scraping

Pyswip

PySwip is a Python - SWI-Prolog bridge that lets you query SWI-Prolog from your Python programs. It features an (incomplete) SWI-Prolog foreign language interface, a utility class that makes querying Prolog easy, and a Pythonic interface.

Stars: ✭ 276 (-2.82%)

Mutual labels: natural-language-processing

dust

Archive web pages with all relevant assets or save as a single file HTML

Stars: ✭ 19 (-93.31%)

Mutual labels: scraping

Lda

LDA topic modeling for node.js

Stars: ✭ 262 (-7.75%)

Mutual labels: natural-language-processing

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.