IATI.cloud

IATI.cloud extracts all published IATI XML files from the IATI Registry and makes them available in a normalised PostgreSQL database that you can access through a RESTful API. The project also stores all the parsed data in Apache Solr cores, allowing for faster querying. The IATI.cloud project currently encompasses two APIs.

IATI is a global aid transparency standard. It uses a unified open standard to make information about aid spending easier to access, re-use and understand. You can find out more about the IATI data standard at: www.iatistandard.org

Requirements

Name	Required version	Installation instructions
Python	3.6.5	Python 3.6.5. Tip: pyenv can be used to manage multiple versions of Python.
PostgreSQL	latest	PostgreSQL
PostGIS	latest	PostGIS. May already be installed, depending on how PostgreSQL was installed.
RabbitMQ	latest	RabbitMQ
Apache Solr	8.2.0	Solr
Python requirements	Installed by requirements.txt	See instructions below
Disk space	1 GB is recommended so that the repository, the PostgreSQL database, Apache Solr and the required services can be installed. Keep in mind that parsing and indexing datasets increases the overall size of the IATI.cloud project, which can reach 80 GB or more.	Not applicable

Setting up your IATI Cloud environment

Go to folder root/OIPA.
Create a virtual environment with the correct Python version (the recommended name is ‘env’), for example:
```
apt install python3-virtualenv
virtualenv --python=/usr/bin/python3.6 env
```
Activate the virtual environment (e.g. source env/bin/activate).

Install uwsgi manually:

add-apt-repository ppa:deadsnakes/ppa
apt-get update
apt-get install build-essential python3.6-dev
pip install uwsgi

Install required libraries using pip install -r requirements.txt
Make sure the following services are running on your installation: PostgreSQL (e.g. check with sudo systemctl status postgresql).
Run pre-commit install --hook-type commit-msg
Create a PostgreSQL database
Add the following .env file to the current working directory:

OIPA_DB_NAME=oipa
OIPA_DB_USER=oipa
OIPA_DB_PASSWORD=oipa
DJANGO_SETTINGS_MODULE=OIPA.development_settings
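
For orientation, the sketch below shows one way variables like these could be turned into a Django DATABASES entry. It is an illustration only, not a copy of OIPA's settings module: the function name database_settings, the OIPA_DB_HOST and OIPA_DB_PORT variables, the localhost/5432 defaults and the choice of the PostGIS backend are all assumptions.

```python
import os


def database_settings(env=os.environ):
    """Build a Django-style DATABASES dict from OIPA_DB_* variables.

    Hypothetical helper: the fallback values mirror the example .env file.
    """
    return {
        "default": {
            # Django's PostGIS backend (assumed here because PostGIS is required).
            "ENGINE": "django.contrib.gis.db.backends.postgis",
            "NAME": env.get("OIPA_DB_NAME", "oipa"),
            "USER": env.get("OIPA_DB_USER", "oipa"),
            "PASSWORD": env.get("OIPA_DB_PASSWORD", "oipa"),
            "HOST": env.get("OIPA_DB_HOST", "localhost"),  # assumed variable name
            "PORT": env.get("OIPA_DB_PORT", "5432"),  # assumed variable name
        }
    }


if __name__ == "__main__":
    print(database_settings()["default"]["NAME"])
```

Passing the environment as an argument keeps the helper easy to try out with a plain dictionary instead of the real process environment.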

Add the file “local_settings.py” with the following information to the folder root/OIPA/OIPA:

SOLR = {
  'indexing': True,
  'url': 'http://localhost:8983/solr',
  'cores': {
       'activity': 'activity',
       'budget': 'budget',
       'dataset': 'dataset',
       'organisation': 'organisation',
       'publisher': 'publisher',
       'result': 'result',
       'transaction': 'transaction',
  }
}
DOWNLOAD_DATASETS = False
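
To make the shape of this dictionary concrete, here is a minimal sketch of how the base URL of a single core could be derived from it. The helper core_url is hypothetical and is not part of OIPA; the dictionary is shortened to two cores for the example.

```python
SOLR = {
    "indexing": True,
    "url": "http://localhost:8983/solr",
    "cores": {"activity": "activity", "budget": "budget"},  # shortened for the example
}


def core_url(settings, core):
    """Return the base URL of one configured Solr core."""
    name = settings["cores"][core]  # raises KeyError if the core is not configured
    return f"{settings['url'].rstrip('/')}/{name}"


if __name__ == "__main__":
    print(core_url(SOLR, "activity"))
```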

Go back to the folder root/OIPA and run database migrations with the command python manage.py migrate
Start the development server with the command python manage.py runserver
Create a superuser account for django with the command python manage.py createsuperuser
Start RabbitMQ with brew services start rabbitmq on Mac or sudo service rabbitmq-server start on Linux.
Start Celery worker with the command: celery -A OIPA worker --loglevel=info --concurrency=10, change the concurrency to your liking.
Start Celery beat with the command: celery -A OIPA beat --loglevel=info -S django
Start Celery flower with the command: celery flower -A OIPA --port=5555
Navigate to your Solr installation.
To use Apache Solr you will need to create the following 7 cores:

activity
budget
dataset
organisation
publisher
result
transaction

To create a core :

Input the following command on the command line: bin/solr create -c [name of your core].
Copy the ‘managed-schema’ file from OIPA/solr/[name of your core]/conf/ and paste it in the server/solr/[name of your core]/conf/ folder of the solr core.

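Run the command bin/solr start to run Solr. Solr has to be running for bin/solr create to work, so start it before creating the cores, and restart it after copying the schema files so that they are loaded.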
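
Since the same two steps are repeated for each of the seven cores, they can be listed programmatically. The sketch below only prints the commands and copy operations described above; it does not run them. The function name core_setup_plan and the /opt/solr install path are assumptions made for the example.

```python
from pathlib import Path

CORES = ["activity", "budget", "dataset", "organisation",
         "publisher", "result", "transaction"]


def core_setup_plan(oipa_dir, solr_dir):
    """Return (create_command, schema_source, schema_target) for every core."""
    plan = []
    for core in CORES:
        command = ["bin/solr", "create", "-c", core]
        source = Path(oipa_dir) / "solr" / core / "conf" / "managed-schema"
        target = Path(solr_dir) / "server" / "solr" / core / "conf" / "managed-schema"
        plan.append((command, source, target))
    return plan


if __name__ == "__main__":
    # Both paths are placeholders; adjust them to your checkout and Solr install.
    for command, source, target in core_setup_plan("OIPA", "/opt/solr"):
        print(" ".join(command))
        print(f"cp {source} {target}")
```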

You can now access the Django admin page at http://localhost:8000/admin/, the Flower dashboard at http://localhost:5555 and the Apache Solr administrative dashboard at http://localhost:8983/solr/#/

Debugging Celery or Apache Solr

Install telnet.
Add the following at the line in your code that you want to debug:

from celery.contrib import rdb
rdb.set_trace()

When the code reaches that line, a notification in the terminal states that it is waiting for the debugger at port [port number].
In another terminal you can then launch: telnet localhost [port number].
This will open the debugger.
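
If you would rather not add and remove the breakpoint by hand, one possible pattern is to hide it behind an environment switch. The sketch below is only a suggestion: the helper rdb_breakpoint and the OIPA_RDB variable are invented for this example and do not exist in OIPA. It relies on rdb.set_trace accepting an optional frame argument.

```python
import os
import sys


def rdb_breakpoint(env=os.environ):
    """Open Celery's remote debugger in the caller, but only when asked to.

    OIPA_RDB is a hypothetical switch invented for this example.
    """
    if env.get("OIPA_RDB") != "1":
        return False
    # Imported lazily so the helper is harmless where Celery is not installed.
    from celery.contrib import rdb
    # Pass the caller's frame so the debugger stops in the task, not in this helper.
    rdb.set_trace(sys._getframe().f_back)
    return True
```

Calling rdb_breakpoint() inside a task then does nothing unless the worker was started with OIPA_RDB=1. Celery's documentation also describes the CELERY_RDB_HOST and CELERY_RDB_PORT environment variables for choosing where the debugger listens.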

Parsing/indexing the data.

This process is managed from the Django administration page. The following is a step by step description of everything that needs to be done to load the data inside your local postgres database.

Disable Apache Solr indexing within OIPA/OIPA/local_settings.py by changing ‘indexing’: True to False.
On the django administration page, run the task to import codelists.
- Wait for this to finish.
On the django administration page, force run the task to update exchange rates.
- Wait for this to finish.
- Activate a scheduled version of this task so that it runs monthly. This is not strictly necessary on a local installation.
Enable Apache Solr indexing.
On the django administration page, run the task to import datasets.
- If you want to make use of the IATI validator, make sure that you set DOWNLOAD_DATASETS = True in OIPA/OIPA/local_settings.py.
- Wait for this to finish.
Wait for the IATI Validator to finish its validation. We can check the status here. If no data is returned, it has finished.
To parse ALL available datasets, run the task to validate the datasets on the django administration page. To parse a specific organisation or a specific dataset, use the corresponding tasks instead.
- Wait for this to finish.
We can now parse and index the datasets that have been prepared in the previous steps. On the django administration page, run the task to parse all datasets.
- Wait for this to finish.

After this all the data is available within our database as well as Solr. Solr can now be used to query the data. Simply select the core containing the information you're interested in, go to the query tab and ask your question.
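
Queries can also be sent over HTTP to a core's select handler instead of through the dashboard. The sketch below builds such a request URL; the helper name solr_select_url is an assumption, and the base URL is the local default used elsewhere in this document. The query *:* matches every document, while more specific queries depend on the field names defined in each core's schema.

```python
from urllib.parse import urlencode


def solr_select_url(core, query="*:*", rows=10, base="http://localhost:8983/solr"):
    """Build the URL of a Solr select request that returns JSON."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/{core}/select?{params}"


if __name__ == "__main__":
    # Paste the printed URL into a browser or fetch it with any HTTP client.
    print(solr_select_url("activity", rows=5))
```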

API Documentation

Full API documentation for iati.cloud can be found at docs.iati.cloud.

About the project

Website: www.iati.cloud
Authors: Zimmerman B.V.
License: AGPLv3 (see included LICENSE file for full license)
Github Repo: github.com/zimmerman-zimmerman/iati.cloud/
Bug Tracker: github.com/zimmerman-zimmerman/iati.cloud/issues

Can I contribute?

Yes! We are mainly looking for coders to help on the project. If you are a coder, feel free to fork the repository and send us your amazing pull requests!

How should I contribute?

Python already has clear PEP 8 code style guidelines, so it's difficult to add something to it, but there are certain key points to follow when contributing:

PEP 8 code style guidelines should always be followed. Tested with flake8 OIPA.
Commitlint is used to check your commit messages.
Always try to reference issues in commit messages or pull requests ("related to #614", "closes #619", etc.).
Avoid huge code commits where the difference cannot even be rendered by browser-based web apps (Github for example). Smaller commits make it much easier to understand why and how the changes were made, and why (if at all) they result in certain bugs.
When developing a new feature, write at least some basic tests for it. This helps not to break other things in the future. Tests can be run with pytest
If there's a reason to commit code that is commented out (there usually should be none), always leave a "FIXME" or "TODO" comment so it's clear for other developers why this was done.
When using external dependencies that are not in PyPI (from Github for example), stick to a particular commit (e.g. git+https://github.com/Supervisor/supervisor@ec495be4e28c694af1e41514e08c03cf6f1496c8#egg=supervisor), so that an update to the library doesn't break everything.
Automatic code quality / testing checks (continuous integration tools) are implemented to check all these things automatically when pushing / merging new branches. Quality is the key!
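
The rule about pinning external dependencies to a commit can be checked mechanically. The sketch below is one possible check, not something the project ships: the function name unpinned_git_requirements is invented, and the pattern assumes that a pinned requirement ends in a full 40-character commit hash, as in the supervisor example above.

```python
import re

# "git+<url>@<40 hex characters>", optionally followed by "#egg=<name>".
PINNED = re.compile(r"^git\+https?://[^@\s]+@[0-9a-f]{40}(#egg=[\w.-]+)?$")


def unpinned_git_requirements(lines):
    """Return git requirements that are not pinned to a full commit hash."""
    return [
        line.strip()
        for line in lines
        if line.strip().startswith("git+") and not PINNED.match(line.strip())
    ]


if __name__ == "__main__":
    with open("requirements.txt") as handle:
        for requirement in unpinned_git_requirements(handle):
            print("not pinned to a commit:", requirement)
```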

Running the tests

Pytest-django is used to run tests. It is installed automatically when the project is set up. To run the tests from the top level directory of the project, run pytest OIPA/. If you are in the same directory as manage.py, running pytest on its own is sufficient. Refer to the pytest-django documentation for details.
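
As a rough idea of what a basic test looks like, pytest by default collects functions whose names start with test_ from files named test_*.py and reports any failing assert. The helper below is hypothetical and exists only to give the test something to check; real tests that touch the database would additionally need pytest-django's database support.

```python
def normalise_identifier(value):
    """Hypothetical helper: strip surrounding whitespace from an identifier."""
    return value.strip()


def test_normalise_identifier_strips_whitespace():
    # A plain assert is all pytest needs to report success or failure.
    assert normalise_identifier("  NL-1-PPR-123 \n") == "NL-1-PPR-123"
```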

Tip: to be able to use debuggers (e.g. ipdb) with pytest, run it with the -s option (to turn off capturing of test output).

Testing / code quality settings can be found in the setup.cfg file. Test coverage settings (for the pytest-cov plugin) can be found in the .coveragerc file.

Who currently makes use of IATI.cloud?

Dutch Ministry of Foreign Affairs: www.openaid.nl
FCDO Devtracker: devtracker.dfid.gov.uk
UNESCO Transparency Portal: opendata.unesco.org
Netherlands Enterprise Agency: aiddata.rvo.nl
Mohinga AIMS: mohinga.info
UN-Habitat: open.unhabitat.org
Overseas Development Institute: ODI.org
UN Migration: IOM.int
AIDA: AIDA.tools

& many others
