Comprehensive Exam
Topic: Deep Learning and Applications to NLP
| Category | Paper | Link |
| --- | --- | --- |
| Survey papers | Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828. | [PDF] |
| | Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1-127. | [PDF] |
| | Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., & Bengio, S. (2010). Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 11, 625-660. | [PDF] |
| Deep belief networks | Salakhutdinov, R. (2009). Learning deep generative models (Doctoral dissertation, University of Toronto). | [PDF] (thesis) |
| | Salakhutdinov, R., & Hinton, G. E. (2009). Deep Boltzmann machines. In International Conference on Artificial Intelligence and Statistics (pp. 448-455). | [PDF] |
| Large-scale | Le, Q. V., Ranzato, M. A., Monga, R., Devin, M., Chen, K., Corrado, G. S., ... & Ng, A. Y. (2011). Building high-level features using large scale unsupervised learning. arXiv preprint arXiv:1112.6209. | [PDF] |
| Breakthroughs | Hinton, G. E., Osindero, S., & Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527-1554. | [PDF] |
| | Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of deep networks. In J. Platt et al. (Eds.), Advances in Neural Information Processing Systems 19 (NIPS 2006) (pp. 153-160). MIT Press. | [PDF] |
| | Ranzato, M., Poultney, C., Chopra, S., & LeCun, Y. (2007). Efficient learning of sparse representations with an energy-based model. In J. Platt et al. (Eds.), Advances in Neural Information Processing Systems (NIPS 2006). MIT Press. | [PDF] |
| | Lee, H., Battle, A., Raina, R., & Ng, A. (2006). Efficient sparse coding algorithms. In Advances in Neural Information Processing Systems (pp. 801-808). | [PDF] |
| | Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P.-A. (2008). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (pp. 1096-1103). ACM. | [PDF] |
| Deep learning in NLP: word embeddings | Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research (JMLR), 12, 2493-2537. | [PDF] |
| | Mnih, A., & Hinton, G. (2007). Three new graphical models for statistical language modelling. In Proceedings of the 24th International Conference on Machine Learning (pp. 641-648). ACM. | [PDF] |
| | Turian, J., Ratinov, L., & Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 384-394). Association for Computational Linguistics. | [PDF] |
| | Mikolov, T., Karafiát, M., Burget, L., Černocký, J., & Khudanpur, S. (2010). Recurrent neural network based language model. In INTERSPEECH (pp. 1045-1048). | [PDF] |
| | Mikolov, T., Yih, W. T., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of NAACL-HLT (pp. 746-751). | [PDF] |
| | Arisoy, E., Sainath, T. N., Kingsbury, B., & Ramabhadran, B. (2012). Deep neural network language models. In Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT (pp. 20-28). Association for Computational Linguistics. | [PDF] |
| | Bordes, A., Glorot, X., Weston, J., & Bengio, Y. (2012). Joint learning of words and meaning representations for open-text semantic parsing. In AISTATS 2012. | [PDF] |
| Sentiment classification | Glorot, X., Bordes, A., & Bengio, Y. (2011). Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning (ICML-11). | [PDF] |
| | Maas, A. L., Daly, R. E., Pham, P. T., Huang, D., Ng, A. Y., & Potts, C. (2011). Learning word vectors for sentiment analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (pp. 142-150). Association for Computational Linguistics. | [PDF] |
| | Socher, R., Pennington, J., Huang, E. H., Ng, A. Y., & Manning, C. D. (2011). Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 151-161). Association for Computational Linguistics. | [PDF] |
| Spoken dialogue system | Henderson, M., Thomson, B., & Young, S. (2013). Deep neural network approach for the dialog state tracking challenge. In Proceedings of the SIGDIAL 2013 Conference. | [PDF] |
| | Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A. R., Jaitly, N., ... & Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6), 82-97. | [PDF] |
| Paraphrase detection | Socher, R., Huang, E. H., Pennington, J., Manning, C. D., & Ng, A. (2011). Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Advances in Neural Information Processing Systems (pp. 801-809). | [PDF] |
| Parsing | Socher, R., Bauer, J., Manning, C. D., & Ng, A. Y. (2013). Parsing with compositional vector grammars. In Proceedings of the ACL Conference. | [PDF] |