Suyash458 / Wiktionaryparser

Licence: MIT
A Python Wiktionary Parser

Wiktionary Parser

A Python project that downloads words from the English Wiktionary (en.wiktionary.org) and parses each article's content into an easy-to-use JSON format. It currently parses etymologies, definitions, pronunciations, examples, audio links, and related words.

JSON structure

[{
    "pronunciations": {
        "text": ["pronunciation text"],
        "audio": ["pronunciation audio"]
    },
    "definitions": [{
        "relatedWords": [{
            "relationshipType": "word relationship type",
            "words": ["list of related words"]
        }],
        "text": ["list of definitions"],
        "partOfSpeech": "part of speech",
        "examples": ["list of examples"]
    }],
    "etymology": "etymology text"
}]
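
As a rough illustration of how this structure might be consumed, the sketch below walks a list shaped like the JSON above and collects one line per definition. The helper name summarize_entries and the sample data are hypothetical and not part of the library; the sketch assumes only the keys shown in the structure.

```python
from typing import Any, Dict, List


def summarize_entries(entries: List[Dict[str, Any]]) -> List[str]:
    """Flatten parsed entries into 'part of speech: definition' lines.

    Hypothetical helper: relies only on the documented keys
    ("definitions", "partOfSpeech", "text") and tolerates missing ones.
    """
    lines = []
    for entry in entries:
        for definition in entry.get("definitions", []):
            pos = definition.get("partOfSpeech", "unknown")
            for text in definition.get("text", []):
                lines.append(f"{pos}: {text}")
    return lines


# Made-up sample data that mirrors the documented structure.
sample = [{
    "etymology": "etymology text",
    "pronunciations": {"text": ["pronunciation text"], "audio": []},
    "definitions": [{
        "partOfSpeech": "noun",
        "text": ["first definition", "second definition"],
        "examples": [],
        "relatedWords": [],
    }],
}]

for line in summarize_entries(sample):
    print(line)
```

Using .get with defaults keeps the helper from failing if an entry lacks one of the keys.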

Installation

Using pip
  • run pip install wiktionaryparser
From Source
  • Clone the repo or download the zip
  • cd to the folder
  • run pip install -r "requirements.txt"

Usage

  • Import the WiktionaryParser class.
  • Initialize an object and use the fetch("word", "language") method.
  • The default language is English; it can be changed using the set_default_language method.
  • Include/exclude parts of speech to be parsed using include_part_of_speech(part_of_speech) and exclude_part_of_speech(part_of_speech)
  • Include/exclude relations to be parsed using include_relation(relation) and exclude_relation(relation)

Examples

>>> from wiktionaryparser import WiktionaryParser
>>> parser = WiktionaryParser()
>>> word = parser.fetch('test')
>>> another_word = parser.fetch('test', 'french')
>>> parser.set_default_language('french')
>>> parser.exclude_part_of_speech('noun')
>>> parser.include_relation('alternative forms')
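
To look up several words in one go, a small wrapper along these lines could work. This is a minimal sketch, not part of the library: fetch_many is a hypothetical helper, and it assumes only that the parser object exposes the fetch(word, language) method described under Usage.

```python
from typing import Any, Dict, Iterable, List, Optional


def fetch_many(parser: Any, words: Iterable[str],
               language: Optional[str] = None) -> Dict[str, List[dict]]:
    """Call parser.fetch for each word and collect the results by word.

    Hypothetical helper: `parser` is any object with a
    fetch(word, language) method, e.g. a WiktionaryParser instance.
    """
    results = {}
    for word in words:
        if language is None:
            # Fall back to the parser's default language.
            results[word] = parser.fetch(word)
        else:
            results[word] = parser.fetch(word, language)
    return results
```

With the real parser, a call such as fetch_many(WiktionaryParser(), ['test', 'example'], 'french') would need network access to en.wiktionary.org.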

Requirements

  • requests==2.20.0
  • beautifulsoup4==4.4.0

Contributions

If you want to add features or improvements, or report issues, feel free to send a pull request!

License

Wiktionary Parser is licensed under MIT.
