HuberTRoy / Seen

Licence: other
A lightweight crawling/spider framework for everyone (supports JavaScript!). ✨

Programming language: Python

Projects that are alternatives of or similar to Seen

LiteNetwork
A simple and fast .NET networking library compatible with .NET Standard 2, .NET 5, 6 and 7.
Stars: ✭ 66 (+407.69%)
Mutual labels:  easy-to-use
brute-md5
Advanced, Light Weight & Extremely Fast MD5 Cracker/Decoder/Decryptor written in Python 3
Stars: ✭ 16 (+23.08%)
Mutual labels:  easy-to-use
Lang-app
Add a multi lang configuration to your WEB APP 'from scratch' [ANY FRAMEWORK, ANY PLUGIN, ANY API]
Stars: ✭ 15 (+15.38%)
Mutual labels:  easy-to-use
react-native-easybluetooth-classic
⚛ A Library for easy implementation of Serial Bluetooth Classic on React Native (Android Only).
Stars: ✭ 44 (+238.46%)
Mutual labels:  easy-to-use
Tkinter-Designer
An easy and fast way to create a Python GUI 🐍
Stars: ✭ 4,697 (+36030.77%)
Mutual labels:  easy-to-use
zcrawl
An open source web crawling platform
Stars: ✭ 21 (+61.54%)
Mutual labels:  web-crawling
Xception-with-Your-Own-Dataset
Easy-to-use scripts for training and inferencing with Xception on your own dataset
Stars: ✭ 51 (+292.31%)
Mutual labels:  easy-to-use
Benzaiboten-spot-trading-bot
A trading bot easy to use to be linked to your favorite exchange to automatize the trading on cryptocurrencies
Stars: ✭ 20 (+53.85%)
Mutual labels:  easy-to-use
birthday.py
🎉 A simple discord bot in discord.py that helps you understand the usage of SQL databases
Stars: ✭ 30 (+130.77%)
Mutual labels:  easy-to-use
sakura-dmhy
Sakura - a simple tool
Stars: ✭ 29 (+123.08%)
Mutual labels:  easy-to-use
Sparkora
Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟
Stars: ✭ 51 (+292.31%)
Mutual labels:  easy-to-use
nightly-docker-rebuild
Use nightli.es 🌔 to rebuild N docker 🐋 images 📦 on hub.docker.com
Stars: ✭ 13 (+0%)
Mutual labels:  easy-to-use
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (+138.46%)
Mutual labels:  web-crawling
AndroidVerify
Android library designed for rapid and customizable form validation.
Stars: ✭ 41 (+215.38%)
Mutual labels:  easy-to-use
activity-based-security-framework
Exadel Activity-based Security Framework
Stars: ✭ 17 (+30.77%)
Mutual labels:  easy-to-use
QArchive
Async C++ Cross-Platform library that modernizes libarchive using Qt5 🚀. Simply extracts 7z 🍔, Tarballs 🎱 and other supported formats by libarchive. ❤️
Stars: ✭ 66 (+407.69%)
Mutual labels:  easy-to-use
AppImageUpdater
AppImage Updater for Humans built with QML/C++ with Qt5 ❤️.
Stars: ✭ 31 (+138.46%)
Mutual labels:  easy-to-use
hacktoberfest-2019
You can check the video here: #hacktoberfest
Stars: ✭ 28 (+115.38%)
Mutual labels:  easy-to-use
query2report
Query2Report is a simple open source business intelligence platform that allows users to build report/dashboard for business analytics or enterprise reporting
Stars: ✭ 43 (+230.77%)
Mutual labels:  easy-to-use
discord.json
Discord.json | Make your own discord bot with json !
Stars: ✭ 27 (+107.69%)
Mutual labels:  easy-to-use

Seen

Seen is a lightweight web crawling framework for everyone. It is written with asyncio and aiohttp/requests.

It is useful for writing a web crawler quickly, with full JavaScript support.
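
Since the framework is described as asyncio-based and the usage example sets a concurrency attribute, the general idea of bounded-concurrency fetching can be sketched as follows. This is not Seen's own code: fetch and crawl are hypothetical names, fetch only simulates a request (no network access), and asyncio.run assumes Python 3.7 or newer.

```python
import asyncio


async def fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP request; no network access."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


async def crawl(urls, concurrency=1):
    """Fetch all urls with at most `concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def worker(url):
        nonlocal in_flight, peak
        async with semaphore:
            # Track how many fetches run at once to show the limit holds.
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fetch(url)
            finally:
                in_flight -= 1

    # gather returns results in the same order as the input urls.
    pages = await asyncio.gather(*(worker(u) for u in urls))
    return pages, peak


urls = [f"https://example.com/page/{i}" for i in range(5)]
pages, peak = asyncio.run(crawl(urls, concurrency=2))
print(len(pages), "pages fetched, peak concurrency:", peak)
```

A semaphore keeps the sketch short; a real crawler would more likely use a queue of pending URLs so that newly discovered links can be added while workers are running.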

Working Process: [figure: workingProcess]

Requirements:

  • Python 3.5+
  • aiohttp or requests
  • pyquery

Installation:

pip install seen

For JavaScript support, also install pyppeteer:

pip install pyppeteer
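
Because pyppeteer is an optional extra, a script may want to check that it is installed before enabling the browser. A minimal sketch using only the standard library; has_module is a hypothetical helper, not part of Seen.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# Only turn the browser on when the optional dependency is installed.
use_browser = has_module("pyppeteer")
print("pyppeteer available:", use_browser)
```

The resulting boolean could then be assigned to the use_browser attribute of a spider class.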

Usage:

  1. Write spider.py
from seen import Spider, Parser, Item, Css


class Post(Item):
    # Each field is described by a CSS selector.
    title = Css('title')
    # 'src' presumably names the attribute to read from each match.
    img = Css('img', 'src')

    def save(self):
        # The extracted values are read from self.result.
        print(self.result['title'])
        print(self.result['img'])


class MySpider(Spider):
    roots = 'https://www.v2ex.com'
    url_limit = ('www.v2ex.com')
    concurrency = 1
    # Set use_browser = True to load JavaScript; the default is False.
    use_browser = False

    parsers = [Parser(Post)]


if __name__ == '__main__':
    spider = MySpider()
    spider.start()
  2. Run python spider.py.
  3. Check the result.
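
To make the Post item above more concrete, here is a rough illustration of what selecting title and the src of each img means for a page. It uses only the standard library rather than Seen or pyquery, so it is an approximation of the idea and not the framework's implementation. The class name TitleAndImages and the sample HTML are made up for the example, and whether Seen returns a single value or a list for img is not stated here.

```python
from html.parser import HTMLParser


class TitleAndImages(HTMLParser):
    """Collect the <title> text and every <img> src attribute."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            # attrs is a list of (name, value) pairs.
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


SAMPLE_HTML = """
<html>
  <head><title>Example page</title></head>
  <body><img src="/a.png"><img src="/b.png" alt="b"><img alt="no source"></body>
</html>
"""

parser = TitleAndImages()
parser.feed(SAMPLE_HTML)
result = {"title": parser.title, "img": parser.images}
print(result)
```

With Seen itself, the same information is declared with Css fields and then read from self.result inside save.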

Contribution

  • Pull request.
  • Open an issue.