
okfn-brasil / busca-querido-diario

Licence: MIT license
A project for keyword search over the text files extracted by Querido Diário.

Programming Languages

python

busca-querido-diario

Dependencies

Setup

pip install -r requirements.txt

Run

The Querido Diário scrapers output Brazilian municipal gazettes as text files. For example:

scrapy crawl sc_florianopolis -a start_date='2020-05-05'

The command above collects all the official gazettes of the city of Florianópolis published from 2020-05-05 onward.
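When the crawl has to be repeated for several spiders or dates, it can help to build the command in Python. The following is only a sketch: the helper name `build_crawl_command` is hypothetical and not part of this project, and it assumes the `start_date` argument is an ISO date, as in the example above.

```python
import datetime
import shlex
from typing import List


def build_crawl_command(spider: str, start_date: str) -> List[str]:
    """Return the argv for `scrapy crawl <spider> -a start_date=<date>`.

    Hypothetical helper, not part of busca-querido-diario.
    """
    # Fail early on a malformed date; fromisoformat raises ValueError.
    datetime.date.fromisoformat(start_date)
    return ["scrapy", "crawl", spider, "-a", f"start_date={start_date}"]


if __name__ == "__main__":
    command = build_crawl_command("sc_florianopolis", "2020-05-05")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` from inside the Querido Diário scraper directory.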

Once the scraper has produced the text files, tell the loader where they are so that Elasticsearch can index them:

python load_data.py /home/user/path-to/diario-oficial/data/full/
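The loading step can be pictured roughly as follows. This is a minimal sketch and not the actual contents of `load_data.py`: the function name `iter_gazette_documents`, the `.txt` extension, and the `path`/`content` field names are all assumptions made for illustration. Sending the documents to Elasticsearch is left out.

```python
import sys
from pathlib import Path
from typing import Dict, Iterator


def iter_gazette_documents(data_dir: str) -> Iterator[Dict[str, str]]:
    """Yield one document per .txt file found under data_dir, recursively.

    Assumption: the scraper output is stored as UTF-8 .txt files.
    """
    for path in sorted(Path(data_dir).rglob("*.txt")):
        yield {
            "path": str(path),
            # Replace undecodable bytes instead of aborting the whole load.
            "content": path.read_text(encoding="utf-8", errors="replace"),
        }


if __name__ == "__main__":
    # Usage: python this_sketch.py /path/to/data/full/
    for document in iter_gazette_documents(sys.argv[1]):
        print(document["path"], len(document["content"]))
```

Each yielded dictionary is in a shape that could be handed to an Elasticsearch client for indexing.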

Finally, here is the command to search for a word:

python search.py covid-19

The results are written to a folder named after the search term: covid-19/*
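To make the search step more concrete, here is a hedged sketch of its two halves: building an Elasticsearch `match` query for the term, and writing each hit into a folder named after that term. It is not the project's `search.py`. The helper names, the `content` field, the shape of a hit, and the numbered file names are assumptions.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_match_query(term: str, field: str = "content") -> Dict:
    """Return an Elasticsearch query body matching `term` in `field`."""
    return {"query": {"match": {field: term}}}


def save_hits(term: str, hits: List[Dict[str, str]], base_dir: str = ".") -> Path:
    """Write each hit's text to <base_dir>/<term>/NNNN.txt and return the folder.

    Assumption: every hit is a dict with a "content" key, and `term`
    contains no path separators.
    """
    out_dir = Path(base_dir) / term
    out_dir.mkdir(parents=True, exist_ok=True)
    for number, hit in enumerate(hits, start=1):
        (out_dir / f"{number:04d}.txt").write_text(hit["content"], encoding="utf-8")
    return out_dir


if __name__ == "__main__":
    print(json.dumps(build_match_query("covid-19")))
```

A `match` query analyzes the term before searching, so how a hyphenated term such as `covid-19` is tokenized depends on the analyzer configured for the index.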
