sngjuk / Meme Glossary

License: MIT
Meme serving with NLP

Programming Languages

python

Projects that are alternatives of or similar to Meme Glossary

Mcavoy
Discover what visitors are searching for on your WordPress site.
Stars: ✭ 24 (-20%)
Mutual labels:  search
Wechat
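🔥 iOS: builds the overall basic architecture of WeChat (7.0.0+) with MVVM + RAC + ViewModel-Based Navigation, and implements main features such as Moments, Contacts, the pull-down Mini Programs panel, and Search. Strict code conventions, exhaustively detailed comments, meticulous attention to detail, documentation for the core features, and a visual experience close to 98% faithful to the native app. Not much code, lots of comments. (Continuously updated; stay tuned; Stars and Forks welcome…)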
Stars: ✭ 870 (+2800%)
Mutual labels:  search
Opensse
Open Sketch Search Engine- 3D object retrieval based on sketch image as input
Stars: ✭ 883 (+2843.33%)
Mutual labels:  search
Multisearchview
Yet another built-in animated search view for Android.
Stars: ✭ 837 (+2690%)
Mutual labels:  search
Pytelbot
A playful bot in telegram
Stars: ✭ 12 (-60%)
Mutual labels:  memes
Vscode Tsquery
TSQuery extension for Visual Studio Code
Stars: ✭ 13 (-56.67%)
Mutual labels:  search
Alfred Unicode
Preview Unicode characters and emoji in Alfred
Stars: ✭ 23 (-23.33%)
Mutual labels:  search
Pelias Android Sdk
Android sdk for pelias
Stars: ✭ 20 (-33.33%)
Mutual labels:  search
Vanilla Select
Standalone replacement for select boxes.
Stars: ✭ 12 (-60%)
Mutual labels:  search
Flexsearch
Next-Generation full text search library for Browser and Node.js
Stars: ✭ 8,108 (+26926.67%)
Mutual labels:  search
Better Search
Better Search WordPress plugin
Stars: ✭ 9 (-70%)
Mutual labels:  search
Splatoon 2 Meme Mod
Splatoon 2 mod about memes.
Stars: ✭ 11 (-63.33%)
Mutual labels:  memes
Txtai
AI-powered search engine
Stars: ✭ 874 (+2813.33%)
Mutual labels:  search
Blast
Blast is a full text search and indexing server, written in Go, built on top of Bleve.
Stars: ✭ 934 (+3013.33%)
Mutual labels:  search
Spimedb
EXPLORE & EDIT REALITY
Stars: ✭ 14 (-53.33%)
Mutual labels:  search
Search Ui
🔍 A set of UI components to build a fully customized search!
Stars: ✭ 24 (-20%)
Mutual labels:  search
Ultrix
Ultrix is a meme website for collecting memes and sharing them with friends on the website.
Stars: ✭ 13 (-56.67%)
Mutual labels:  memes
App Search Node
Elastic App Search Official Node.js Client
Stars: ✭ 29 (-3.33%)
Mutual labels:  search
Scrapy Azuresearch Crawler Samples
Scrapy as a Web Crawler for Azure Search Samples
Stars: ✭ 20 (-33.33%)
Mutual labels:  search
Cerebro
Open-source productivity booster with a brain
Stars: ✭ 7,181 (+23836.67%)
Mutual labels:  search

meme-glossary

  • Retrieve meme images over ZeroMQ (zmq) by matching the sentence embedding of a query.
  • Generate memes from comics.

Install

Python 3 is required.

Client-only usage :

git clone https://github.com/sngjuk/meme-glossary.git
cd meme-glossary
./install.sh client

Full usage :

git clone --recurse-submodules https://github.com/sngjuk/meme-glossary.git
cd meme-glossary
./install.sh all

Usage :

Please check the ./example folder.

Client :

import client
mc = client.MgClient(ip='localhost', port=5555)

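# Query with a sentence: return at most max_img memes whose similarity is at least min_sim.
mc.dank(['Nice to meet you'], max_img=3, min_sim=0.15)

# Fetch a random meme.
mc.random()

# Save image data to a file.
# img_data is a placeholder for image data obtained from one of the calls above;
# see the ./example folder for the exact return format.
mc.save_meme(img_data, 'image.jpg')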

Server :

./app.py --model_path=model.bin --meme_dir=meme_dir --xml_dir=xml_dir --vec_path=meme_voca.vec

Example : (see the example folder)

Prepare memes from comic books.

1. Crawl comics from the web. (Please find your own source of memes; this script crawls Korean comics.)
Output : Comic book image files. (1_original_comics)

prepare_memes/comics_crawler.py

2. Cut the comic book pages into scenes.
Input : Comic book image files. (1_original_comics)
Output : Cut Scenes. (2_kumiko_cut_meme)

prepare_memes/cutter.py --kumiko=/prepare_memes/kumiko --meme_dir=1_original_comics --out_dir=2_kumiko_cut_meme

3. Filter out bad cuts manually. (A GUI environment is recommended.)
Input : Cut Scenes. (2_kumiko_cut_meme)
Output : Manually filtered memes. (3_manual_filtered_meme)

4-1. Label with the Google Cloud Vision API. (Please check --lang_hint and the pricing policy on this repo's wiki page.)
Input : Manually filtered memes. (3_manual_filtered_meme)
Output : Meme label xml. (4_label_xml)

prepare_memes/auto_labeler.py --meme_dir=3_manual_filtered_meme --output_dir=4_label_xml --lang_hint=' '

4-2. Or label manually.

prepare_memes/manual_labeler.py --meme_dir=3_manual_filtered_meme --output_dir=4_label_xml

4-3. Or label with RectLabel. (The xml format is compatible with RectLabel.)
https://rectlabel.com/

5. Generate a .vec file for similarity search. {episode/filename : vector}
Input : Meme label xml. (4_label_xml), Sentence embedding model. (model.bin) - please see below.
Output : .vec file for similarity search. (5_meme_voca.vec)

prepare_memes/xml2vec.py --model_path=model.bin --xml_dir=4_label_xml --vec_path=5_meme_voca.vec
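
To illustrate how a {episode/filename : vector} mapping can serve similarity search, here is a minimal sketch that ranks memes by cosine similarity to a query vector. It is not the repository's implementation: the function names, the toy three-dimensional vectors, and the file names are hypothetical, and the max_img and min_sim parameters only mirror the client call shown earlier. The use of cosine similarity is itself an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_memes(query_vec, meme_vecs, max_img=3, min_sim=0.15):
    """Return up to max_img (name, similarity) pairs with similarity >= min_sim, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in meme_vecs.items()]
    scored = [(name, sim) for name, sim in scored if sim >= min_sim]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_img]


# Hypothetical toy data standing in for the contents of a .vec file.
meme_vecs = {
    "ep1/hello.jpg": [1.0, 0.0, 0.0],
    "ep1/bye.jpg": [0.0, 1.0, 0.0],
    "ep2/greet.jpg": [0.8, 0.6, 0.0],
}

results = top_memes([1.0, 0.0, 0.0], meme_vecs, max_img=2)
for name, sim in results:
    print(name, round(sim, 3))
```

In a real setup the query vector would come from the same sentence embedding model that produced the .vec file, so that query and meme vectors share one vector space.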

Prepare Sentence Embedding Model.

Pretrained models : Pretrained Eng model
Note : To train a new sent2vec model, you first need a large training text file containing one sentence per line. The provided code does not perform tokenization or lowercasing, so you have to preprocess your input data yourself.
*You can replace the NLP model (with something other than sent2vec) by simply changing /server/nlp/model.py
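
As a rough sketch of the preprocessing mentioned in the note above, the following turns raw text into lowercased, whitespace-tokenized lines with one sentence per line. The sentence splitting and tokenization rules here are deliberately naive assumptions chosen for illustration; the function name is hypothetical, and a proper tokenizer may suit your corpus better.

```python
import re

# Naive assumptions: a sentence ends at '.', '!' or '?' followed by whitespace,
# and a token is either a run of word characters or a single punctuation mark.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
_TOKEN = re.compile(r"\w+|[^\w\s]")


def preprocess(text):
    """Return a list of lowercased, space-separated token lines, one per sentence."""
    lines = []
    for sentence in _SENTENCE_END.split(text.strip()):
        tokens = _TOKEN.findall(sentence.lower())
        if tokens:
            lines.append(" ".join(tokens))
    return lines


corpus_lines = preprocess("Nice to meet you! How are you doing today?")
for line in corpus_lines:
    print(line)
```

Writing the returned lines to a text file, one per line, gives a file in the shape the note describes.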

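Korean models

  1. Pretrained KR model (trained on 220 MB of preprocessed Namuwiki text; because the training data is limited, the model has quite a few unknown words)

  2. Pretrained decomposed KR model (trained after jamo decomposition; it performs better than the model above but has the same OOV problem)
    *To use jamo-decomposed queries, pass the --lang=ko option to xml2vec.py and app.py.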
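
For readers unfamiliar with jamo decomposition, the sketch below splits precomposed Hangul syllables into their component letters using Unicode NFD normalization from the Python standard library. This only illustrates the idea: the repository's own decomposition may use a different method or a different jamo representation, and the function name is hypothetical.

```python
import unicodedata


def decompose_jamo(text):
    """Split precomposed Hangul syllables into conjoining jamo via Unicode NFD."""
    return unicodedata.normalize("NFD", text)


decomposed = decompose_jamo("한국어")

# Each syllable becomes 2 or 3 jamo code points: 3 + 3 + 2 for this word.
print(len(decomposed))
print([f"U+{ord(ch):04X}" for ch in decomposed])

# NFC normalization recomposes the original syllables.
print(unicodedata.normalize("NFC", decomposed) == "한국어")
```

Non-Hangul characters such as ASCII letters pass through NFD unchanged, so mixed Korean and English queries can go through the same function.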

Done! Start the server :

./app.py --model_path=model.bin --meme_dir=3_manual_filtered_meme --xml_dir=4_label_xml --vec_path=5_meme_voca.vec (add --lang=ko when using the jamo-decomposed model)

Test with the client :

example