ckan-php-manager
A collection of scripts that perform tasks using the CKAN API and https://github.com/GSA/ckan-php-client
Requirements
- PHP 7.0+ : http://php.net
Installation
Clone repository
$ git clone https://github.com/GSA/ckan-php-manager.git
Composer
Use Composer to install or update dependencies.
If you don't have Composer, install it:
$ curl -sS https://getcomposer.org/installer | php
$ mv composer.phar /usr/local/bin/composer
Install dependencies:
$ composer install
Configuration
Copy inc/config.sample.php to inc/config.php and update it with your own values, if needed.
$ cp inc/config.sample.php inc/config.php
Usage
Export all packages by Agency name, including all Sub Agencies
- Update cli/export_packages_by_org.php, setting ORGANIZATION_TO_EXPORT to the title of the organization to export
- Run the script using php
$ php cli/export_packages_by_org.php
The script takes all terms, including sub-agencies, from http://www.data.gov/app/themes/roots-nextdatagov/assets/Json/fed_agency.json and makes CKAN requests to look up the packages of each organization in that list.
Once the script has finished, results can be found in the /results/{timestamp} dir, including a _{term}.log file with package counts for each agency.
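The per-organization lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in cli/export_packages_by_org.php (which goes through ckan-php-client): the helper names buildPackageCountUrl and extractPackageCount are hypothetical, and the organization name gsa-gov is a placeholder. The sketch uses the CKAN package_search action with rows=0 so that only the total count is returned.

```php
<?php
// Hypothetical helper (not part of this repository): build a package_search
// URL that asks CKAN how many datasets belong to one organization.
function buildPackageCountUrl(string $ckanUrl, string $orgName): string
{
    $query = http_build_query([
        'q'    => 'organization:' . $orgName,
        'rows' => 0, // only the total count is needed, not the datasets
    ]);
    return rtrim($ckanUrl, '/') . '/api/3/action/package_search?' . $query;
}

// Hypothetical helper: read the dataset count from a package_search
// response body. Returns null when the response is not a successful one.
function extractPackageCount(string $json)
{
    $response = json_decode($json, true);
    if (!is_array($response) || empty($response['success'])) {
        return null;
    }
    return $response['result']['count'] ?? null;
}

// Example with a canned response instead of a real HTTP request.
$url = buildPackageCountUrl('https://catalog.data.gov/', 'gsa-gov');
$count = extractPackageCount('{"success": true, "result": {"count": 12, "results": []}}');
```

Keeping URL building and response parsing separate from the HTTP call makes both parts easy to check without network access.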
DMS legacy tag
To add the tag add_legacy_dms_and_make_private to all datasets of a group:
- Update ORGANIZATION_TO_TAG in cli/add_legacy_dms_and_make_private.php
- Double-check CKAN_URL and CKAN_API_KEY for editing datasets
- Run the script
$ php cli/add_legacy_dms_and_make_private.php
Assign groups and category tags to datasets
- Put csv files in the /data dir, named assign_<any-title>.csv (the assign_ prefix is required). The format of these files must be: dataset, group, categories
  The first line is the header; keep it as the first line of each file:
  dataset,group,categories
  Then put one dataset per line.
- Dataset can be:
  * Dataset url, e.g. https://catalog.data.gov/dataset/food-access-research-atlas
  * Dataset name, e.g. download-crossing-inventory-data-highway-rail-crossing
  * Dataset id
- Group: just one group per line. If you need to add multiple groups, create another row in the csv with the same dataset and another group, because all the categories in a row are tagged under that row's group. Make sure the group exists in your CKAN instance (to list all existing groups, go to http://catalog.data.gov/api/3/action/group_list?all_fields=true , replacing catalog.data.gov with your CKAN domain)
- Categories: one or multiple categories for the current row's group, separated by semicolon ;
Example csv file:
dataset, group, categories
https://catalog.data.gov/dataset/food-access-research-atlas,Agriculture,"Natural Resources and Environment"
aerial-image-of-alaskas-arctic-coastal-plain-1955,Climate,"Arctic; Arctic Ocean, Sea Ice and Coasts; Permafrost and Arctic Landscapes"
28d30c1f-75a5-4042-b0fc-de26cc7d70f2,Climate,Arctic; Arctic Development and Transport
- Double-check CKAN_URL and CKAN_API_KEY for editing datasets, defined in inc/config.php
- Run the script
$ php cli/tagging/assign_groups_and_tags.php
- Detailed logs and results are stored in the folder results/[time-stamp]_ASSIGN_GROUPS
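To make the row format above concrete, here is a minimal sketch of how one data row could be split into its three parts. It is an assumption-laden illustration, not the parser used by cli/tagging/assign_groups_and_tags.php: the function name parseAssignRow is hypothetical, and reducing a dataset url to its last path segment is one plausible way to handle the "url, name or id" rule, not necessarily what the script does.

```php
<?php
// Hypothetical helper: parse one data row of an assign_*.csv file into
// dataset, group and a list of categories.
function parseAssignRow(string $line): array
{
    // Quoted fields may contain commas, so use a real CSV parser.
    $fields = str_getcsv($line, ',', '"', '\\');
    $fields = array_pad($fields, 3, '');
    $fields = array_map(function ($field) {
        return trim((string) $field);
    }, $fields);
    list($dataset, $group, $categories) = $fields;

    // Assumption: a dataset url is reduced to its last path segment,
    // which is the dataset name on catalog.data.gov.
    if (preg_match('#^https?://#i', $dataset)) {
        $path = parse_url($dataset, PHP_URL_PATH);
        if (is_string($path)) {
            $dataset = basename(rtrim($path, '/'));
        }
    }

    // Categories are separated by semicolons; drop empty entries.
    $list = array_map('trim', explode(';', $categories));
    $list = array_values(array_filter($list, 'strlen'));

    return ['dataset' => $dataset, 'group' => $group, 'categories' => $list];
}

$row = parseAssignRow(
    'aerial-image-of-alaskas-arctic-coastal-plain-1955,Climate,'
    . '"Arctic; Arctic Ocean, Sea Ice and Coasts; Permafrost and Arctic Landscapes"'
);
$urlRow = parseAssignRow(
    'https://catalog.data.gov/dataset/food-access-research-atlas,Agriculture,'
    . '"Natural Resources and Environment"'
);
```

Note that a category containing a comma, such as "Arctic Ocean, Sea Ice and Coasts", only survives parsing when the whole categories field is quoted, as in the example file.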
Remove groups and category tags from datasets (revert previous script changes)
- Prepare csv files in the same format as for the previous script and put them in the /data dir, named remove_<any-title>.csv
- Run the script
$ php cli/tagging/remove_groups_and_tags.php
- This command will remove listed categories from the dataset of the row. If an empty list of categories is provided, this command will remove the group and all categories from the dataset.
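The removal rule above can be modelled as a small pure function. This is a sketch of the described behaviour only: the in-memory shape used here (a list of groups plus a map from group to categories) and the name applyRemoval are assumptions for illustration, and do not reflect how the script or CKAN actually stores category tags.

```php
<?php
// Hypothetical model of a dataset's state:
//   ['groups' => [group, ...], 'categories' => [group => [category, ...]]]
// Applies one row of a remove_*.csv file to that state.
function applyRemoval(array $state, string $group, array $categories): array
{
    if (count($categories) === 0) {
        // Empty category list: drop the group and all of its categories.
        $state['groups'] = array_values(array_diff($state['groups'], [$group]));
        unset($state['categories'][$group]);
        return $state;
    }

    // Otherwise remove only the listed categories; the group stays.
    $current = $state['categories'][$group] ?? [];
    $state['categories'][$group] = array_values(array_diff($current, $categories));
    return $state;
}

$before = [
    'groups'     => ['Climate', 'Agriculture'],
    'categories' => ['Climate' => ['Arctic', 'Arctic Development and Transport']],
];
$afterCategory = applyRemoval($before, 'Climate', ['Arctic']);
$afterGroup    = applyRemoval($before, 'Climate', []);
```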
CKAN API DOCs
http://docs.ckan.org/en/latest/api/index.html
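As one illustration of the Action API documented there, the group_list call mentioned in the tagging instructions can be used to check that a group exists before tagging. The sketch below assumes the response has already been fetched from /api/3/action/group_list?all_fields=true; the function name groupExists is hypothetical, and matching on name, title or display_name is an assumption about which label the csv files use.

```php
<?php
// Hypothetical helper: check whether a group appears in the JSON body
// returned by the CKAN group_list action (with all_fields=true).
function groupExists(string $json, string $group): bool
{
    $response = json_decode($json, true);
    if (!is_array($response) || empty($response['success'])) {
        return false;
    }
    $result = $response['result'] ?? [];
    if (!is_array($result)) {
        return false;
    }
    foreach ($result as $entry) {
        if (!is_array($entry)) {
            continue;
        }
        // Assumption: the csv may use either the slug or the display title.
        foreach (['name', 'title', 'display_name'] as $key) {
            if (isset($entry[$key]) && strcasecmp((string) $entry[$key], $group) === 0) {
                return true;
            }
        }
    }
    return false;
}

// Canned response body used instead of a real HTTP request.
$body = '{"success": true, "result": ['
    . '{"name": "climate5434", "title": "Climate", "display_name": "Climate"}]}';
$hasClimate = groupExists($body, 'Climate');
$hasOceans  = groupExists($body, 'Oceans');
```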
Docker setup
To minimize system requirements, we've added a minimal setup with docker-compose. It is intended to replace the usage instructions above as the default workflow.
$ docker-compose build
$ docker-compose run --rm app php cli/harvest_stats_csv.php
Run the tests.
$ docker-compose run --rm app phpunit