A spider for China Judgements Online
This project is no longer maintained and is provided for reference only.
It is intended solely for personal study and technical exchange and must not be used for commercial purposes.
Overview
This is a spider for 中国裁判文书网 (China Judgements Online).
Features
- Supports IP proxies
- Supports multiple processes
- Supports full crawling
- Divides data by decision time, region, and court
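One way to picture the last feature is as a task list in which each task covers a single day, region, and court. The sketch below is illustrative only and is not taken from this project's code: `daterange`, `build_tasks`, and the region and court names are hypothetical, and it assumes that "decision time" is a calendar date.

```python
from datetime import date, timedelta


def daterange(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def build_tasks(start, end, courts_by_region):
    """Return one (day, region, court) tuple per combination.

    courts_by_region is a hypothetical mapping: region name -> list of courts.
    """
    tasks = []
    for day in daterange(start, end):
        for region, courts in courts_by_region.items():
            for court in courts:
                tasks.append((day, region, court))
    return tasks


if __name__ == "__main__":
    # Placeholder names, not real region or court identifiers.
    example = {"region_a": ["court_1", "court_2"], "region_b": ["court_3"]}
    for task in build_tasks(date(2016, 1, 1), date(2016, 1, 2), example):
        print(task)
```

Small, independent tasks like these could then be distributed across worker processes, and a failed task could be retried without repeating the whole crawl.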
Run
python spider.py -num_processes 1 -start_time 2016-1-2 -end_time 2016-1-2
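For readers who want to reproduce a similar command-line interface, the following is a minimal sketch of how the three flags above could be parsed with `argparse`. It is an assumption about how such a script might be structured, not the actual contents of `spider.py`; `parse_date` and `build_parser` are hypothetical names, and the default and required settings are guesses.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Parse dates such as 2016-1-2 (zero padding is optional)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    """Build a parser for the flags shown in the Run command."""
    parser = argparse.ArgumentParser(description="Judgement spider (sketch)")
    # The default of 1 is an assumption for this sketch.
    parser.add_argument("-num_processes", type=int, default=1,
                        help="number of worker processes")
    parser.add_argument("-start_time", type=parse_date, required=True,
                        help="first day to crawl, e.g. 2016-1-2")
    parser.add_argument("-end_time", type=parse_date, required=True,
                        help="last day to crawl, e.g. 2016-1-2")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.num_processes, args.start_time, args.end_time)
```

Converting the dates at parse time means a malformed value is rejected with a usage error before any crawling begins.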
Results
- Raw data
- Processed data