
7 open source projects by jsfenfen

1. 990 Xml Reader
IRSx: Turn the IRS' versioned XML 990 nonprofit annual tax returns into standardized Python objects, JSON, or human-readable text with the original line numbers and descriptions.
✭ 89
python
2. Whatwordwhere
Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the hOCR standard.
✭ 80
html
3. Pdf bbox utils
Helpers to create .csv files of word-level bounding boxes from text-based PDFs or from hOCR output.
✭ 6
python
4. covid hospitals demographics
COVID-19-relevant data on hospital location and capacity, nursing home location and capacity, and county demographics.
5. 990-xml-database
Django app to consume and store 990 data and metadata
6. parsing-prickly-pdfs
NICAR 2016 talk about PDFs!
✭ 61