NISH1001 / nepalimdb

License: GPL-3.0
An IMDb crawler that extracts data on Nepali movies for analysis.

Dependencies

The crawler uses the BeautifulSoup and requests modules for Python 3.
Install them from requirements.txt:

pip install -r requirements.txt

Crawler Usage

Run the nepalimdb.py script with Python:

python nepalimdb.py

Data

The data is dumped as JSON to data/nepali-movies.json.
The file contains a list of JSON objects. Each object describes a single movie with the following fields:

  • imdb_url
  • title
  • year
  • runtime
  • genre
  • rating
  • plot
  • votes
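
The field list above can be turned into a quick sanity check for crawled records. This is a minimal sketch, not part of the crawler: the field names come from this README, while the expected types are an assumption inferred from the one sample record shown here, and `EXPECTED_FIELDS` and `find_problems` are hypothetical names.

```python
from typing import Any

# Field names are taken from the README. The expected types are an
# assumption inferred from its single sample record.
EXPECTED_FIELDS: dict[str, tuple[type, ...]] = {
    "imdb_url": (str,),
    "title": (str,),
    "year": (int,),
    "runtime": (str,),
    "genre": (str,),
    "rating": (int, float),
    "plot": (str,),
    "votes": (int,),
}


def find_problems(movie: dict[str, Any]) -> list[str]:
    """Return human-readable problems for one record; an empty list means none found."""
    problems = []
    for field, types in EXPECTED_FIELDS.items():
        if field not in movie:
            problems.append(f"missing field: {field}")
        elif not isinstance(movie[field], types):
            found = type(movie[field]).__name__
            problems.append(f"unexpected type for {field}: {found}")
    return problems
```

Running `find_problems` over every object in the dump is one way to spot movies whose IMDb page lacked a rating or runtime, if the crawler leaves such fields out or sets them to null.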

Here is a sample JSON object:

{
    "imdb_url": "https://www.imdb.com/title/tt6944688/?ref_=adv_li_tt",
    "title": "A Mero Hajur 2",
    "year": 2017,
    "runtime": "138 min",
    "genre": "Drama, Romance",
    "rating": 8.2,
    "plot": "A Man stalks a girl, after while they fall in love, but their relative don't want the girl to be with that man.",
    "votes": 164
}

The data collected so far is in data/nepali-movies.json.
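
To work with the dump from Python, something like the following may help. It is a sketch under assumptions: `load_movies`, `parse_runtime`, and `split_genres` are hypothetical helpers, not functions from nepalimdb.py, and the "138 min" and "Drama, Romance" formats are taken from the sample object above, so other records may not follow them.

```python
import json
from pathlib import Path
from typing import Optional


def load_movies(path: str = "data/nepali-movies.json") -> list[dict]:
    """Read the JSON dump, which the README describes as a list of objects."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def parse_runtime(runtime: str) -> Optional[int]:
    """Turn a string like "138 min" into 138; return None for any other shape."""
    parts = runtime.split()
    if len(parts) == 2 and parts[0].isdigit() and parts[1] == "min":
        return int(parts[0])
    return None


def split_genres(genre: str) -> list[str]:
    """Turn "Drama, Romance" into ["Drama", "Romance"]."""
    return [g.strip() for g in genre.split(",") if g.strip()]
```

Keeping the raw strings in the JSON and converting them only at analysis time means the dump stays close to what IMDb displays.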

Data Analysis

I have done a basic analysis of the data in a Jupyter notebook included in this repo.
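
For readers who want a starting point without opening the notebook, here is a small standard-library sketch of two simple summaries. It is not the analysis from the notebook: `mean_rating_by_genre` and `movies_per_year` are hypothetical helpers, and they assume the field formats of the sample object in the Data section.

```python
from collections import defaultdict
from statistics import mean


def mean_rating_by_genre(movies: list[dict]) -> dict[str, float]:
    """Average "rating" per genre; a movie counts once for each of its genres."""
    ratings = defaultdict(list)
    for movie in movies:
        rating = movie.get("rating")
        if not isinstance(rating, (int, float)):
            continue  # skip records without a numeric rating
        for genre in movie.get("genre", "").split(","):
            genre = genre.strip()
            if genre:
                ratings[genre].append(rating)
    return {genre: round(mean(values), 2) for genre, values in ratings.items()}


def movies_per_year(movies: list[dict]) -> dict[int, int]:
    """Count movies per "year", returned in ascending year order."""
    counts = defaultdict(int)
    for movie in movies:
        if isinstance(movie.get("year"), int):
            counts[movie["year"]] += 1
    return dict(sorted(counts.items()))


# Usage with the one sample record from this README (only the fields needed):
sample = {"title": "A Mero Hajur 2", "year": 2017,
          "genre": "Drama, Romance", "rating": 8.2}
print(mean_rating_by_genre([sample]))  # {'Drama': 8.2, 'Romance': 8.2}
print(movies_per_year([sample]))       # {2017: 1}
```

With few votes per movie (the sample has 164), per-genre averages can be noisy, so it may be worth filtering on `votes` before drawing conclusions.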

Contributions

Feel free to use the data and the crawler any way you like. If you feel like giving credit, mention me and this repo.
Pull requests are welcome, so feel free to tweak and optimize the code.

Cheers...
