The Problem Statement:
Predict the answer to an open-ended question about a given image.
VQA Library and Setup:
Reference Models:
1. neural-vqa
link: https://github.com/abhshkdz/neural-vqa
2. Deeper LSTM + normalized CNN for Visual Question Answering
link: https://github.com/VT-vision-lab/VQA_LSTM_CNN
3. Hierarchical Question-Image Co-Attention for Visual Question Answering
link: https://github.com/jiasenlu/HieCoAttenVQA
4. Simple Baseline for Visual Question Answering
link: https://github.com/metalbubble/VQAbaseline
5. Visual7W QA Models
link: https://github.com/yukezhu/visual7w-qa-models
6. VQA Demo
link: http://iamaaditya.github.io/2016/04/visual_question_answering_demo_notebook
7. Deep Learning for Visual Question Answering
link: https://github.com/avisingh599/visual-qa
Issue List:
List of References:
- L. Ma, Z. Lu, and H. Li, “Learning to Answer Questions From Image using Convolutional Neural Network”, CoRR abs/1506.00333, Nov. 2015.
- H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu, “Are you talking to a machine? Dataset and methods for multilingual image question answering”, arXiv 1505.05612v3, Nov. 2015.
- M. Ren, R. Kiros, and R. S. Zemel, “Exploring models and data for image question answering”, arXiv 1505.02074, 2015.
- M. Malinowski, M. Rohrbach, and M. Fritz, “Ask your neurons: A neural-based approach to answering questions about images”, arXiv 1505.01121, Nov. 2015.
Useful Links:
1. Memory Networks for Language Understanding, ICML Tutorial 2016
link: http://www.thespermwhale.com/jaseweston/icml2016/
2. End-To-End Memory Networks for Question Answering
link: https://github.com/vinhkhuc/MemN2N-babi-python
3. Implementing Dynamic Memory Networks
link: https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/
MISC
http://www.arxiv-sanity.com/1606.03556
http://www.arxiv-sanity.com/1601.01705
http://www.arxiv-sanity.com/1606.02393
Attention
Deep learning
Image and word attention
http://yanran.li/peppypapers/2015/12/11/nips-2015-deep-learning-symposium-part-i.html
Compositional Semantic Parsing on Semi-Structured Tables
PDF: 1508.00305v1.pdf
A Deep Architecture for Semantic Parsing
PDF: 1404.7296v1.pdf
Question Answering over Knowledge Base with Neural Attention Combining Global Knowledge Information
PDF: 1606.00979v1.pdf
Recurrent Neural Network Encoder with Attention for Community Question Answering
PDF: 1603.07044v1.pdf
Image
Hierarchical Attention Networks
PDF: 1606.02393v1.pdf
Diversified Visual Attention Networks for Fine-Grained Object Classification
PDF: 1606.08572v1.pdf
VQA
Simple Baseline for Visual Question Answering
PDF: 1512.02167v2.pdf
Towards Transparent AI Systems: Interpreting Visual Question Answering Models
PDF: 13_Goyal_SUNw.pdf
http://cjds.github.io/image%20recognition/machine%20learning/2016/05/02/Visual-Question-Generation/
Visual Question Answering Literature Survey
http://iamaaditya.github.io/research/literature/
Attention
https://blog.heuritech.com/2016/01/20/attention-mechanism/
Good one for VQA
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
A Focused Dynamic Attention Model for Visual Question Answering
Visual7W: Grounded Question Answering in Images
Stacked Attention Networks for Image Question Answering
Where To Look: Focus Regions for Visual Question Answering
Revisiting Visual Question Answering Baselines
Simple Baseline for Visual Question Answering
PDF: zhu2016cvpr.pdf
Image Question Answering: A Visual Semantic Embedding Model and a New Dataset
Highway Networks for Visual Question Answering
Neural Self Talk: Image Understanding via Continuous Questioning and Answering
—————————————————————-
Role of Attention for Visual Question Answering Submitted By: Harsh Agrawal (harsh92)
https://computing.ece.vt.edu/~harsh/visualAttention/ProjectWebpage/#approach
————————————————————-
Nice one
Compositional Memory for Visual Question Answering
PDF: 1412.7755v2.pdf
——————————————————————
Good links
https://github.com/kjw0612/awesome-deep-vision
———————————————————————–
Loss function for visual question answering
PDF: 1606.03647.pdf
http://people.cs.vt.edu/~bhuang/courses/pgmsp16/projects/mahendru-pgmsp16
PDF: 1511.05676v1.pdf
https://www.google.co.in/search?q=visual+question+answer+loss+function
https://github.com/kundan2510/vqa_LSTM
—————————————————————-
NLTK
http://textminingonline.com/getting-started-with-word2vec-and-glove
http://textminingonline.com/dive-into-nltk-part-i-getting-started-with-nltk
—————————————————————
Torch7. Hello World, Neural Networks!
http://mdtux89.github.io/2015/12/11/torch-tutorial.html
———————————–
Learning Resources for NLP, Sentiment Analysis, and Deep Learning
———————————————————-
Misc
https://github.com/vivanov879/word2vec
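-- word2vec-style pair model from the repository above, built with nngraph.
-- Assumes vocab_size is already defined and that Embedding is a lookup-table
-- module defined elsewhere (assumed here to come from that repository).
require 'nn'
require 'nngraph'
-- Graph inputs: index of the center word and index of an outer (context) word.
word_center = nn.Identity()()
word_outer = nn.Identity()()
-- Center word: 100-d embedding, projected to 50-d, then tanh.
x_center_ = Embedding(vocab_size, 100)(word_center)
x_center = nn.Linear(100, 50)(x_center_)
x_center = nn.Tanh()(x_center)
-- Outer word: same structure with its own weights.
x_outer_ = Embedding(vocab_size, 100)(word_outer)
x_outer = nn.Linear(100, 50)(x_outer_)
x_outer = nn.Tanh()(x_outer)
-- Squared distance between the two 50-d vectors: (outer - center)^2 summed over dimension 2.
x_center_minus = nn.MulConstant(-1)(x_center)
z = nn.CAddTable()({x_outer, x_center_minus})
z = nn.Power(2)(z)
z = nn.Sum(2)(z)
-- Outputs: the distance and both raw embeddings.
m = nn.gModule({word_center, word_outer}, {z, x_outer_, x_center_})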
————————————————————–
Overfitting
https://www.quora.com/How-can-I-avoid-overfitting
https://www.researchgate.net/post/How_to_Avoid_Overfitting
http://www.kdnuggets.com/2015/01/clever-methods-overfitting-avoid.html
PDF: 2010Overfitting_0416.pdf
Imp: Overfitting
http://cs231n.github.io/neural-networks-2/#reg
http://cs231n.github.io/neural-networks-1/
L2 regularization
https://siavashk.github.io/2016/03/10/l21-regularization/
https://gitter.im/torch/torch7/archives/2015/06/13
https://computing.ece.vt.edu/~harsh/
Optimization
http://cs231n.github.io/neural-networks-3/
https://github.com/torch/optim/blob/master/doc/algos.md
https://github.com/torch/optim/blob/master/sgd.lua
batch size
Imp
PDF: DufourNick.pdf
http://cs231n.stanford.edu/reports.html
PDF: hyhieu_final.pdf
Movie QA
http://movieqa.cs.toronto.edu/home/
Jointly Modeling Embedding and Translation to Bridge Video and Language
PDF: 1505.01861.pdf
Sequence to Sequence – Video to Text
PDF: 1505.00487.pdf
Uncovering Temporal Context for Video Question and Answering
Two-Stream Convolutional Networks for Action Recognition in Videos
Beyond Short Snippets: Deep Networks for Video Classification
Learning Common Sense Through Visual Abstraction
Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors
torch-lrcn
https://github.com/garythung/torch-lrcn
ActivityNet
https://github.com/jrbtaylor/ActivityNet
Describing Videos by Exploiting Temporal Structure
SA-tensorflow
https://github.com/tsenghungchen/SA-tensorflow
https://github.com/yaoli/arctic-capgen-vid
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
https://www.cs.utexas.edu/~vsub/naacl15_project.html#code
video_to_sequence
https://github.com/jazzsaxmafia/video_to_sequence
https://github.com/vsubhashini/caffe/tree/recurrent/examples/s2vt
https://github.com/vsubhashini/caption-eval
https://vsubhashini.github.io/s2vt.html
PDF: 1505.00487v3.pdf
PDF: IVU_Convolutional_Networks_and_Video_Representations.pdf
Segment-CNN
https://github.com/zhengshou/scnn
https://github.com/tmbo/video-classification/blob/master/paper/paper.bib
https://github.com/gtoderici/sports-1m-dataset
http://cs.stanford.edu/people/karpathy/deepvideo/
artistic-videos
https://github.com/manuelruder/artistic-videos
———————————————————–
Word2vec
http://textminingonline.com/getting-started-with-word2vec-and-glove
https://radimrehurek.com/gensim/models/word2vec.html
Attention In VQA
http://iamaaditya.github.io/research/literature/
https://github.com/HyeonwooNoh/DPPnet
https://github.com/ryankiros/skip-thoughts
https://libraries.io/github/johnny5550822/awesome-neat-rnn
https://blog.heuritech.com/2016/01/20/attention-mechanism/
PDF: 1511.02793v2.pdf
————————————————-
LSTM hyperparameters
http://deeplearning4j.org/lstm.html
https://github.com/torch/demos/tree/master/attention
https://github.com/torch/demos
—————————————————–
https://handong1587.github.io/deep_learning/2015/10/09/rnn-and-lstm.html
https://handong1587.github.io/deep_learning/2015/10/09/nlp.html
http://torch.ch/blog/2015/09/21/rmva.html
Github VQA link
https://github.com/handong1587/handong1587.github.io/tree/master/_posts/deep_learning
https://github.com/JamesChuanggg/awesome-vqa
https://github.com/vsubhashini/caffe/tree/recurrent/examples/youtube
https://www.cs.utexas.edu/~vsub/naacl15_project.html#code
Dataset
https://github.com/shuzi/insuranceQA
Thesis
Good Paper
PDF: cvpr2014-deepvideo-rahuls.pdf
IMP PPT
https://github.com/Atcold/torch-Video-Tutorials
For Video
https://github.com/anibali/torchvid
IMP MovieQA
https://github.com/makarandtapaswi/MovieQA_benchmark
For DVS:
Problem
(gedit:8803): WARNING **: Couldn’t connect to accessibility bus: Failed to connect to socket /tmp/dbus-WjKgPvfxFu: Connection refused
Solution
The shell command:
export NO_AT_BRIDGE=1
——————————————————
The output shows a “defunct” (zombie) process: one that has already exited but whose parent has not yet collected its exit status, so its entry stays in the process table. kill -9 PID does not work on such a process because it is already dead; it will keep showing up however often you try.
Instead, determine the parent of the defunct process and kill that. To find it, run:
ps -ef | grep defunct
UID PID PPID C STIME TTY TIME CMD
1000 637 27872 0 Oct12 ? 00:00:04 [chrome] <defunct>
1000 1808 1777 0 Oct04 ? 00:00:00 [zeitgeist-datah] <defunct>
The PPID column is the parent. For the defunct chrome process (PID 637) the parent is 27872, so run kill -9 27872, then verify that the defunct process is gone:
ps -ef | grep defunct
ps -xal | grep defunct
ps -u
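Reading the PPID column by eye can be scripted. What follows is only a minimal sketch, assuming a ps that accepts -eo pid,ppid,stat,comm (as procps on Linux does) and reports zombies with a state starting with Z; the function name zombie_parents is made up for illustration.

```shell
# List every defunct (zombie) process together with its parent PID.
# Reads rows in `ps -eo pid,ppid,stat,comm` format on stdin, so it can
# also be tried on saved ps output. The first row is treated as a header.
zombie_parents() {
    awk 'NR > 1 && $3 ~ /^Z/ { print "zombie " $1 " parent " $2 " (" $4 ")" }'
}

# Live usage: prints nothing when there are no zombies.
ps -eo pid,ppid,stat,comm | zombie_parents
```

The parent PID printed on each line is the one to pass to kill.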
- First find the process id of firefox using the following command in any directory:
pidof firefox
- Kill firefox process using the following command in any directory:
kill [firefox pid]
The easiest solution for a program that is not responding would be:
killall -9 firefox
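Because -9 (SIGKILL) gives a program no chance to clean up, a gentler variant is to send the default SIGTERM first and use -9 only if the process is still there after a short wait. A rough sketch; the function name stop_pid and the 5 second default are assumptions chosen for illustration.

```shell
# Ask a process to exit with SIGTERM; fall back to SIGKILL after a grace period.
# Usage: stop_pid PID [GRACE_SECONDS]
stop_pid() {
    pid=$1
    grace=${2:-5}
    # If the signal cannot be sent (process already gone, or not ours), stop here.
    kill "$pid" 2>/dev/null || return 0
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Still present after the grace period: force it.
    kill -9 "$pid" 2>/dev/null
    return 0
}
```

For firefox this could be used as: for p in $(pidof firefox); do stop_pid "$p"; done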