Albert-W / Python_crawler

It is designed to be a simple, tiny, practical Python crawler that uses JSON and SQLite instead of MySQL or MongoDB. The target website is Zhihu.com.

Programming Languages

javascript

184084 projects - #8 most used programming language

Labels

flask sqlite3 sqlalchemy zhihu

Projects that are alternatives to or similar to Python crawler

Flask Restplus Boilerplate

A boilerplate for flask restful web service

Stars: ✭ 466 (+935.56%)

Mutual labels: sqlalchemy, flask

Flask Marshmallow

Flask + marshmallow for beautiful APIs

Stars: ✭ 666 (+1380%)

Mutual labels: sqlalchemy, flask

Potion

Flask-Potion is a RESTful API framework for Flask and SQLAlchemy, Peewee or MongoEngine

Stars: ✭ 484 (+975.56%)

Mutual labels: sqlalchemy, flask

Data Driven Web Apps With Flask

Course demo code and other hand-out materials for our data-driven web apps in Flask course

Stars: ✭ 388 (+762.22%)

Mutual labels: sqlalchemy, flask

Flask Jwt Router

Flask JWT Router is a Python library that adds authorised routes to a Flask app.

Stars: ✭ 43 (-4.44%)

Mutual labels: sqlalchemy, flask

Mini Shop Server

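A WeChat mini-program backend built on the Flask framework, used to build the backend of a mini-program shop (e-commerce; RBAC permission management; auto-generated Swagger-style API documentation; usable as a "Python graduation project"; imooc series) ---- related blog link: 🌟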

Stars: ✭ 446 (+891.11%)

Mutual labels: sqlalchemy, flask

Flask Rest Jsonapi

Flask extension to build REST APIs around JSONAPI 1.0 specification.

Stars: ✭ 566 (+1157.78%)

Mutual labels: sqlalchemy, flask

Safrs

SqlAlchemy Flask-Restful Swagger Json:API OpenAPI

Stars: ✭ 255 (+466.67%)

Mutual labels: sqlalchemy, flask

Databook

A facebook for data

Stars: ✭ 26 (-42.22%)

Mutual labels: sqlalchemy, flask

Flask Sqlalchemy Booster

Collection of utilities and decorators which add extensive querying and serializing capabilities to Flask SQLalchemy models

Stars: ✭ 5 (-88.89%)

Mutual labels: sqlalchemy, flask

Enferno

A Python framework based on Flask microframework, with batteries included, and best practices in mind.

Stars: ✭ 385 (+755.56%)

Mutual labels: sqlalchemy, flask

Ecache

👏👏 Integrate cache(redis) [flask etc.] with SQLAlchemy.

Stars: ✭ 28 (-37.78%)

Mutual labels: sqlalchemy, flask

Flask Sqlalchemy

Adds SQLAlchemy support to Flask

Stars: ✭ 3,658 (+8028.89%)

Mutual labels: sqlalchemy, flask

Full Stack

Full stack, modern web application generator. Using Flask, PostgreSQL DB, Docker, Swagger, automatic HTTPS and more.

Stars: ✭ 451 (+902.22%)

Mutual labels: sqlalchemy, flask

Flask Sqlacodegen

🍶 Automatic model code generator for SQLAlchemy with Flask support

Stars: ✭ 283 (+528.89%)

Mutual labels: sqlalchemy, flask

The database toolkit for go

Stars: ✭ 524 (+1064.44%)

Mutual labels: sqlalchemy, sqlite3

nim-gatabase

Connection-Pooling Compile-Time ORM for Nim

Stars: ✭ 103 (+128.89%)

Mutual labels: sqlalchemy, sqlite3

flaskbooks

A very light social network & RESTful API for sharing books using flask!

Stars: ✭ 19 (-57.78%)

Mutual labels: sqlalchemy, sqlite3

Mixer

Mixer -- Is a fixtures replacement. Supported Django, Flask, SqlAlchemy and custom python objects.

Stars: ✭ 743 (+1551.11%)

Mutual labels: sqlalchemy, flask

Flask Bones

An example of a large scale Flask application using blueprints and extensions.

Stars: ✭ 849 (+1786.67%)

Mutual labels: sqlalchemy, flask

python_crawler

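This project aims to be a lightweight, readable, and easily extensible Zhihu crawler.

From the start, the design avoided extra frameworks and database engines wherever possible, so it is a plain-Python crawler whose database is SQLite, the lightest option available. All customization is loaded from the config file; editing it changes the crawler's behavior.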

Demo

Frontend view

Database view

Prerequisites

SQLAlchemy is used to map between database rows and objects, and Flask provides the web server. No other packages are required.

pip install sqlalchemy
pip install flask

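File overview

Root directory

zhihu.db: the SQLite database file that stores the crawled data
temp.json: temporary data that does not need to be stored in the database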

backend

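Handles crawling and persistence.

config.py: all configuration is managed centrally in config.py. Editing config.py extends what the program can do.
create_table.py: defines the table schema and creates the tables in the database through the ORM.
dbTool.py: wraps database operations as Python functions.
zhihu.py: implements all of the crawling logic.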

frontend

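Folder for the visualization.

templates/: HTML templates, the entry point of the frontend display
static/: resource files such as images, CSS, and JS
run.py: implements the Flask routes, which serve two purposes:

1 Pass JSON data to the frontend.

2 Pass the display page to the frontend.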
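The two purposes of run.py could look roughly like the sketch below. It is not the project's actual code: the `/zhihu/data` path, the `SAMPLE_ROWS` list, and the inline `PAGE` template are assumptions made so the example runs on its own, whereas the real project serves a template from templates/ and reads rows from zhihu.db.

```python
from flask import Flask, jsonify, render_template_string

app = Flask(__name__)

# Hypothetical stand-in for rows that would be read from zhihu.db.
SAMPLE_ROWS = [
    {"authorName": "example_author", "vote": 12, "content": "example answer text"},
]

# Inline template (assumption) so the sketch runs without a templates/ folder.
PAGE = "<!doctype html><title>zhihu</title><div>{{ count }} answers</div>"


@app.route("/zhihu/data")  # hypothetical path for the JSON endpoint
def zhihu_data():
    # Purpose 1: pass JSON data to the frontend.
    return jsonify(SAMPLE_ROWS)


@app.route("/zhihu")
def zhihu_page():
    # Purpose 2: pass the display page to the frontend.
    return render_template_string(PAGE, count=len(SAMPLE_ROWS))


if __name__ == "__main__":
    # Flask's development server listens on http://127.0.0.1:5000 by default.
    app.run()
```

Keeping the data route separate from the page route lets the page's JavaScript fetch the JSON on demand instead of embedding it in the HTML.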

Database design

Fields to store

id = Column(Integer, primary_key = True, autoincrement = True)
articleId = Column(Integer)
authorName = Column(String(length = 32))
authorId = Column(Integer)
followers = Column(Integer)
createTime = Column(String)
createDate = Column(Text)
vote = Column(Integer)
content = Column(Text)
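For context, these fields could sit inside a declarative model along the lines of the sketch below. The class name `Answer`, the table name `answers`, and the helpers `create_tables` and `column_names` are hypothetical, and the import path assumes SQLAlchemy 1.4 or later. The project's own create_table.py may be organized differently.

```python
from sqlalchemy import Column, Integer, String, Text, create_engine, inspect
from sqlalchemy.orm import declarative_base  # assumes SQLAlchemy 1.4+

Base = declarative_base()


class Answer(Base):  # hypothetical class name
    __tablename__ = "answers"  # hypothetical table name

    id = Column(Integer, primary_key=True, autoincrement=True)
    articleId = Column(Integer)
    authorName = Column(String(length=32))
    authorId = Column(Integer)
    followers = Column(Integer)
    createTime = Column(String)
    createDate = Column(Text)
    vote = Column(Integer)
    content = Column(Text)


def create_tables(url="sqlite:///:memory:"):
    """Create every table declared on Base and return the engine.

    The in-memory default is for experimenting; passing
    "sqlite:///zhihu.db" would write to a file instead.
    """
    engine = create_engine(url)
    Base.metadata.create_all(engine)
    return engine


def column_names(engine, table="answers"):
    """List the column names the database reports for a table."""
    return [col["name"] for col in inspect(engine).get_columns(table)]
```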

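Usage

Edit config.py and enter the page you want to crawl and the corresponding regular expressions.
Run create_table.py to create the database and its tables.
Run zhihu.py to crawl the configured page and write the results to the database. Default: zhihu.db
Run run.py to start the web server, then open it in a browser. Default: http://127.0.0.1:5000/zhihu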
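As an illustration of the first step, the sketch below pairs a target page with per-field regular expressions and applies them to a string of page source. The `CONFIG` layout, the `extract` helper, and the patterns themselves are assumptions for illustration. The real config.py may use different names, and real Zhihu markup will need different patterns.

```python
import re

# Hypothetical configuration: a target page plus one regex per field.
CONFIG = {
    "url": "https://www.zhihu.com/",  # placeholder target page
    "patterns": {
        "authorName": r'"authorName":"([^"]*)"',
        "vote": r'"vote":(\d+)',
    },
}


def extract(html, patterns):
    """Apply each configured regex to the page source.

    Returns a dict that maps every field name to the list of
    captured strings, in the order they appear in the page.
    """
    return {field: re.findall(pattern, html) for field, pattern in patterns.items()}


if __name__ == "__main__":
    # Made-up page fragment, used instead of a live request.
    sample = '{"authorName":"alice","vote":3}{"authorName":"bob","vote":7}'
    print(extract(sample, CONFIG["patterns"]))
```

Because the patterns live in the configuration, pointing the crawler at a different page only requires editing that dictionary, not the extraction code.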

Stars: ✭ 45
