edwardsamuel / Wilayah Administratif Indonesia

Administrative Subdivisions of Indonesia (Provinces, Regencies/Cities, Districts, and Villages)

The data were taken from the MFD and MBS Update site of Statistics Indonesia (Badan Pusat Statistik, BPS) at http://mfdonline.bps.go.id/ on 11 January 2018.

The data were fetched from the BPS site with curl:

curl "http://mfdonline.bps.go.id/index.php?link=hasil_pencarian" --data "pilihcari=desa&kata_kunci="

with each of a, i, u, e, and o supplied in turn as the keyword (the kata_kunci value).
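
As a rough illustration of the request described above, the following Python sketch builds the same form-encoded POST for each vowel keyword. The helper names (build_form_data, build_request, fetch_results) are hypothetical and are not taken from the repository's scripts. It assumes the endpoint still accepts the same pilihcari and kata_kunci fields, which may no longer be true.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# URL and form fields are copied from the curl command above.
SEARCH_URL = "http://mfdonline.bps.go.id/index.php?link=hasil_pencarian"
KEYWORDS = ["a", "i", "u", "e", "o"]


def build_form_data(keyword):
    """Return the form-encoded body that curl --data would send."""
    return urlencode({"pilihcari": "desa", "kata_kunci": keyword}).encode("ascii")


def build_request(keyword):
    """Build a POST request; passing data makes urllib use POST, like curl --data."""
    return Request(SEARCH_URL, data=build_form_data(keyword))


def fetch_results(keyword, timeout=30):
    """Send the request and return the raw response body (needs network access)."""
    with urlopen(build_request(keyword), timeout=timeout) as response:
        return response.read()


# Example usage (not executed here because it needs the BPS site to be reachable):
# pages = {keyword: fetch_results(keyword) for keyword in KEYWORDS}
```

Searching on each vowel presumably covers every village name that contains at least one vowel, so the same village can appear in several result sets and the results need to be de-duplicated afterwards.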

Statistics

+------+---------------------------+----------------+-----------+----------------+
| Kode | Provinsi                  | Kabupaten/Kota | Kecamatan | Desa/Kelurahan |
+------+---------------------------+----------------+-----------+----------------+
| 11   | ACEH                      |             23 |       289 |           6509 |
| 12   | SUMATERA UTARA            |             33 |       448 |           6102 |
| 13   | SUMATERA BARAT            |             19 |       179 |           1160 |
| 14   | RIAU                      |             12 |       169 |           1876 |
| 15   | JAMBI                     |             11 |       141 |           1562 |
| 16   | SUMATERA SELATAN          |             17 |       236 |           3263 |
| 17   | BENGKULU                  |             10 |       128 |           1515 |
| 18   | LAMPUNG                   |             15 |       228 |           2642 |
| 19   | KEPULAUAN BANGKA BELITUNG |              7 |        47 |            366 |
| 21   | KEPULAUAN RIAU            |              7 |        70 |            395 |
| 31   | DKI JAKARTA               |              6 |        44 |            254 |
| 32   | JAWA BARAT                |             27 |       627 |           5832 |
| 33   | JAWA TENGAH               |             35 |       573 |           8008 |
| 34   | DI YOGYAKARTA             |              5 |        78 |            414 |
| 35   | JAWA TIMUR                |             38 |       666 |           7856 |
| 36   | BANTEN                    |              8 |       155 |           1501 |
| 51   | BALI                      |              9 |        57 |            653 |
| 52   | NUSA TENGGARA BARAT       |             10 |       116 |           1062 |
| 53   | NUSA TENGGARA TIMUR       |             22 |       307 |           3202 |
| 61   | KALIMANTAN BARAT          |             14 |       174 |           2073 |
| 62   | KALIMANTAN TENGAH         |             14 |       136 |           1537 |
| 63   | KALIMANTAN SELATAN        |             13 |       152 |           1971 |
| 64   | KALIMANTAN TIMUR          |             10 |       103 |           1002 |
| 65   | KALIMANTAN UTARA          |              5 |        53 |            466 |
| 71   | SULAWESI UTARA            |             15 |       171 |           1790 |
| 72   | SULAWESI TENGAH           |             13 |       175 |           1953 |
| 73   | SULAWESI SELATAN          |             24 |       307 |           2975 |
| 74   | SULAWESI TENGGARA         |             17 |       222 |           2301 |
| 75   | GORONTALO                 |              6 |        77 |            722 |
| 76   | SULAWESI BARAT            |              6 |        69 |            639 |
| 81   | MALUKU                    |             11 |       118 |           1180 |
| 82   | MALUKU UTARA              |             10 |       116 |           1155 |
| 91   | PAPUA BARAT               |             13 |       217 |           1730 |
| 94   | PAPUA                     |             29 |       567 |           4866 |
+------+---------------------------+----------------+-----------+----------------+

Note:

The data are provided as-is, and there appear to be two anomalies:

Some entries contain non-ASCII characters:

72  SULAWESI TENGAH 09  KABUPATEN TOJO UNA-UNA  070 TOGEAN  016 TITIRIí POPOLION
94  PAPUA 33  KABUPATEN PUNCAK  042 MAGEÁBUME

Some village IDs are duplicated:

91  PAPUA BARAT 07  KABUPATEN SORONG  182 WEMAK 005 KAMLIN
91  PAPUA BARAT 07  KABUPATEN SORONG  182 WEMAK 005 KWARI

91  PAPUA BARAT 09  KABUPATEN TAMBRAUW  070 KEBAR 015 ANARUM
91  PAPUA BARAT 09  KABUPATEN TAMBRAUW  070 KEBAR 015 JAMBUANI

The statistics above were generated from the MySQL version, which uses INSERT IGNORE statements, so duplicate items are skipped.
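
To make the two anomalies concrete, here is a minimal Python sketch that flags non-ASCII names and drops rows whose composite ID was already seen, keeping the first occurrence. It assumes the uniqueness key is the full (province, regency, district, village) code; the actual schema is not shown here, so treat that as an assumption. The function names are hypothetical, and the sample rows are the ones listed in the note above.

```python
# Sample rows: (province, regency, district, village, name), taken from the note above.
ROWS = [
    ("91", "07", "182", "005", "KAMLIN"),
    ("91", "07", "182", "005", "KWARI"),
    ("91", "09", "070", "015", "ANARUM"),
    ("91", "09", "070", "015", "JAMBUANI"),
    ("72", "09", "070", "016", "TITIRIí POPOLION"),
]


def non_ascii_rows(rows):
    """Return the rows whose name contains at least one non-ASCII character."""
    return [row for row in rows if not row[-1].isascii()]


def drop_duplicate_ids(rows):
    """Keep the first row per composite ID and collect the later ones as skipped.

    This mimics the effect of INSERT IGNORE under the assumption that the
    composite ID is the unique key: later rows with the same key are ignored.
    """
    seen = set()
    kept, skipped = [], []
    for row in rows:
        key = row[:4]
        if key in seen:
            skipped.append(row)
        else:
            seen.add(key)
            kept.append(row)
    return kept, skipped


kept, skipped = drop_duplicate_ids(ROWS)
print("non-ASCII:", [row[-1] for row in non_ascii_rows(ROWS)])
print("skipped duplicates:", [row[-1] for row in skipped])
```

Which of two duplicate rows survives depends only on insertion order, so counts produced this way can undercount villages that share an ID.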

Generate new data

To generate new data:

cd scripts
pip install -r requirements.txt
./run.sh

The scripts default to mysql as the database hostname and root as the database username. If your local database setup differs, pass your own values as arguments:

./run.sh -h [my-db-host] -u [my-db-username]
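
The defaults-with-overrides behaviour described above can be sketched in Python as follows. This is only an illustration and not how run.sh is implemented; parse_db_args is a hypothetical name. Because -h is used for the host, argparse's built-in -h help flag has to be disabled.

```python
import argparse


def parse_db_args(argv=None):
    """Parse -h/-u with the defaults named above (mysql host, root user)."""
    # add_help=False frees -h, which argparse would otherwise reserve for help.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--help", action="help", help="show this message and exit")
    parser.add_argument("-h", dest="host", default="mysql", help="database hostname")
    parser.add_argument("-u", dest="user", default="root", help="database username")
    return parser.parse_args(argv)


defaults = parse_db_args([])
print(defaults.host, defaults.user)

custom = parse_db_args(["-h", "localhost", "-u", "alice"])
print(custom.host, custom.user)
```

Here localhost and alice are placeholder values standing in for [my-db-host] and [my-db-username].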

You can also use docker-compose (preferred):

docker-compose build
docker-compose up

License

  • The scripts are licensed under MIT.
  • The data (CSV and SQL) are licensed under ODbL v1.0.
  • The source data is attributed to Badan Pusat Statistik (BPS) Indonesia.

Contributing

  1. Fork it (https://github.com/edwardsamuel/Wilayah-Administratif-Indonesia/fork).
  2. Create your feature branch (git checkout -b my-new-feature).
  3. Commit your changes (git commit -am 'Add some feature').
  4. Push to the branch (git push origin my-new-feature).
  5. Create a new Pull Request.