885 open source projects that are alternatives to or similar to Clean Text

Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (+10.92%)
Mtnt
Code for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-83.1%)
raspagem-de-dados-fatec
📓 Short course on web data scraping with Python, taught at the Technology Week at FATEC Jundiaí
Stars: ✭ 22 (-92.25%)
Mutual labels:  scraping
Lingua Rs
👄 The most accurate natural language detection library in the Rust ecosystem, suitable for long and short text alike
Stars: ✭ 260 (-8.45%)
TorScrapper
A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (-91.55%)
Mutual labels:  scraping
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-76.06%)
Mutual labels:  scraping
Tacred Relation
PyTorch implementation of the position-aware attention model for relation extraction
Stars: ✭ 271 (-4.58%)
api-flight.com
Main API Flight Git Repository
Stars: ✭ 26 (-90.85%)
Mutual labels:  scraping
Bluebert
BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).
Stars: ✭ 273 (-3.87%)
scrapy-zyte-smartproxy
Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
Stars: ✭ 317 (+11.62%)
Mutual labels:  scraping
Bist Parser
Graph-based and Transition-based dependency parsers based on BiLSTMs
Stars: ✭ 257 (-9.51%)
dmi-instascraper
A GUI for Instaloader for scraping users and hashtags on Instagram
Stars: ✭ 21 (-92.61%)
Mutual labels:  scraping
facebook-discussion-tk
A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
Stars: ✭ 33 (-88.38%)
Mutual labels:  scraping
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (-3.87%)
policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-92.25%)
Mutual labels:  scraping
Adaptnlp
An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.
Stars: ✭ 278 (-2.11%)
python-overwatch
A simple API for scraping Overwatch stats
Stars: ✭ 14 (-95.07%)
Mutual labels:  scraping
Awesome Ai Awesomeness
A curated list of awesome awesomeness about artificial intelligence
Stars: ✭ 268 (-5.63%)
Babler
Data Collection System For NLP/Speech Recognition
Stars: ✭ 21 (-92.61%)
Mutual labels:  scraping
Languagecrunch
LanguageCrunch NLP server docker image
Stars: ✭ 281 (-1.06%)
whatsapp-tracking
Scraping the status of WhatsApp contacts
Stars: ✭ 49 (-82.75%)
Mutual labels:  scraping
Matterport3dsimulator
AI Research Platform for Reinforcement Learning from Real Panoramic Images.
Stars: ✭ 260 (-8.45%)
pomp
Screen scraping and web crawling framework
Stars: ✭ 61 (-78.52%)
Mutual labels:  scraping
Nlp tasks
Natural Language Processing Tasks and References
Stars: ✭ 2,968 (+945.07%)
chirps
Twitter bot powering @arichduvet
Stars: ✭ 35 (-87.68%)
Mutual labels:  scraping
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (-10.21%)
Scraper-Projects
🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (-91.2%)
Mutual labels:  scraping
instagram explorer
📷 An app to scrape Instagram posts and analyze data.
Stars: ✭ 17 (-94.01%)
Mutual labels:  scraping
Nlp Tutorial
Tutorial: Natural Language Processing in Python
Stars: ✭ 274 (-3.52%)
jazz
The Scripting Engine that Combines Speed, Safety, and Simplicity
Stars: ✭ 132 (-53.52%)
Mutual labels:  scraping
Awesome Distributed Deep Learning
A curated list of awesome Distributed Deep Learning resources.
Stars: ✭ 277 (-2.46%)
bots-zoo
No description or website provided.
Stars: ✭ 59 (-79.23%)
Mutual labels:  scraping
Olivia
💁‍♀️Your new best friend powered by an artificial neural network
Stars: ✭ 3,114 (+996.48%)
scraper
Nodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.
Stars: ✭ 37 (-86.97%)
Mutual labels:  scraping
Scrapy Crawlera
Crawlera middleware for Scrapy
Stars: ✭ 281 (-1.06%)
Mutual labels:  scraping
memes-api
API for scraping common meme sites
Stars: ✭ 17 (-94.01%)
Mutual labels:  scraping
Awesomefakenews
This repository contains recent research on fake news.
Stars: ✭ 270 (-4.93%)
webdext
Intelligent Web Data Extractor
Stars: ✭ 75 (-73.59%)
Mutual labels:  scraping
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-2.46%)
Mutual labels:  scraping
PyLex
Perform lexical analysis on words, one word at a time.
Stars: ✭ 60 (-78.87%)
Mutual labels:  scraping
Nlpython
This repository contains code for natural language processing in Python. All of the code relates to my book, "Python Natural Language Processing".
Stars: ✭ 265 (-6.69%)
Zeiver
A Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-95.07%)
Mutual labels:  scraping
Link Grammar
The CMU Link Grammar natural language parser
Stars: ✭ 286 (+0.7%)
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-94.72%)
Mutual labels:  scraping
Apify Js
Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+1010.56%)
Mutual labels:  scraping
humanparser
Parse a human name string into salutation, first name, middle name, last name, suffix.
Stars: ✭ 78 (-72.54%)
Mutual labels:  scraping
Pyswip
PySwip is a Python/SWI-Prolog bridge that lets you query SWI-Prolog from your Python programs. It features an (incomplete) SWI-Prolog foreign language interface, a utility class that makes querying Prolog easy, and a Pythonic interface.
Stars: ✭ 276 (-2.82%)
dust
Archive web pages with all relevant assets, or save them as single-file HTML
Stars: ✭ 19 (-93.31%)
Mutual labels:  scraping
Lda
LDA topic modeling for node.js
Stars: ✭ 262 (-7.75%)
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-92.25%)
Mutual labels:  scraping
Lambdasoup
Functional HTML scraping and rewriting with CSS in OCaml
Stars: ✭ 280 (-1.41%)
Mutual labels:  scraping
shup
A POSIX shell script to parse HTML
Stars: ✭ 28 (-90.14%)
Mutual labels:  scraping
Ai Job Notes
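Job-hunting guide for AI algorithm positions (covers preparation guides, practice-problem guides, referrals, a list of AI companies, and other materials)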
Stars: ✭ 3,191 (+1023.59%)
image-collector
Download images from Google Image Search
Stars: ✭ 38 (-86.62%)
Mutual labels:  scraping
Autonlp
🤗 AutoNLP: train state-of-the-art natural language processing models and deploy them in a scalable environment automatically
Stars: ✭ 263 (-7.39%)
naos
📉 Uptime and error monitoring CLI
Stars: ✭ 30 (-89.44%)
Mutual labels:  scraping
Articutapi
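API of Articut, a Chinese word segmenter with semantic part-of-speech tagging. Word segmentation is the foundation of Chinese information processing. Articut uses no machine learning and requires no data model; relying only on the grammar rules of modern vernacular Chinese, it achieves an F1-measure above 94% and a recall above 96% on SIGHAN 2005.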
Stars: ✭ 252 (-11.27%)
Textract
extract text from any document. no muss. no fuss.
Stars: ✭ 3,165 (+1014.44%)
Oie Resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (-0.35%)
Swem
The Tensorflow code for this ACL 2018 paper: "Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms"
Stars: ✭ 279 (-1.76%)