The objective is to generate a textual description of an image based on the objects and actions in it, using generative models so that the system creates novel sentences. Pipeline-type models use two separate learning processes, one for language modelling and the other for image recognition. It first identifies objects in the image and prov…

Stars: ✭ 20 (-66.67%)

Mutual labels: image-captioning

Image Caption Generator

A neural network that generates captions for an image using a CNN and an RNN with beam search.

Stars: ✭ 126 (+110%)

Mutual labels: image-captioning

YANGstraight source

Analytic signal-based source information analysis for YANGstraight and real-time interactive tools

Stars: ✭ 31 (-48.33%)

Mutual labels: speech-analysis

Sightseq

Computer vision tools for fairseq, containing PyTorch implementations of text recognition and object detection

Stars: ✭ 116 (+93.33%)

Mutual labels: image-captioning

Show-Attend-and-Tell

A PyTorch implementation of the paper Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

Stars: ✭ 58 (-3.33%)

Mutual labels: image-captioning

Video2description

Video to Text: Generates a natural-language description for a given video (Video Captioning)

Stars: ✭ 107 (+78.33%)

Mutual labels: image-captioning

Image-Captioning-with-Beam-Search

Generating image captions using Xception Network and Beam Search in Keras

Stars: ✭ 18 (-70%)

Mutual labels: image-captioning

Arnet

CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present

Stars: ✭ 94 (+56.67%)

Mutual labels: image-captioning

Show and Tell

Show and Tell: A Neural Image Caption Generator

Stars: ✭ 74 (+23.33%)

Mutual labels: image-captioning

Automatic Image Captioning

Generating Captions for images using Deep Learning

Stars: ✭ 84 (+40%)

Mutual labels: image-captioning

Bayesian-Pitch-Tracking-Using-Harmonic-model

Pitch detection and pitch tracking, voicing/unvoicing detection (VAD)

Stars: ✭ 70 (+16.67%)

Mutual labels: speech-analysis

Cameramanager

Simple Swift class that provides all the configuration you need to create a custom camera view in your app

Stars: ✭ 1,130 (+1783.33%)

Mutual labels: image-captioning

Awesome-Captioning

A curated list of multimodal captioning research (including image captioning, video captioning, and text captioning)

Stars: ✭ 56 (-6.67%)

Mutual labels: image-captioning

Image Captioning

Image Captioning: Implementing the Neural Image Caption Generator with Python

Stars: ✭ 52 (-13.33%)

Mutual labels: image-captioning

Show Control And Tell

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019

Stars: ✭ 243 (+305%)

Mutual labels: image-captioning

Bottom Up Attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Stars: ✭ 989 (+1548.33%)

Mutual labels: image-captioning

wavenet-classifier

Keras Implementation of Deepmind's WaveNet for Supervised Learning Tasks

Stars: ✭ 54 (-10%)

Mutual labels: speech-analysis

Im2p

TensorFlow implementation of the paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs

Stars: ✭ 15 (-75%)

Mutual labels: image-captioning

Caption generator

A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.

Stars: ✭ 243 (+305%)

Mutual labels: image-captioning

Show Attend And Tell

TensorFlow Implementation of "Show, Attend and Tell"

Stars: ✭ 869 (+1348.33%)

Mutual labels: image-captioning

MIA

Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)

Stars: ✭ 57 (-5%)

Mutual labels: image-captioning

Omninet

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

Stars: ✭ 448 (+646.67%)

Mutual labels: image-captioning

Dataturks

ML data annotation made easy for teams. Just upload data, add your team, and build training/evaluation datasets in hours.

Stars: ✭ 200 (+233.33%)

Mutual labels: image-captioning

Oscar

Oscar and VinVL

Stars: ✭ 396 (+560%)

Mutual labels: image-captioning

magphase

MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.