The ultimate goal of AI is a world taking care of itself, so the humans can finally become a pack of carefree, barefooted, long-haired hippies. (← irony)

Posts

Machine Translation Weekly 35: Word Translation of Transformer Layers

I always rationalized the encoder-decoder architecture as a conditional language model. The decoder is the language model, the component that “knows” the target language (whatever that means) and uses the encoder as an external memory, so it does not forget what it was talking about....

mt-weekly  en 

Machine Translation Weekly 34: Echo State Neural Machine Translation

This week I am going to write a few notes on the paper Echo State Neural Machine Translation by Google Research from a few weeks ago. Echo state networks are a rather weird idea: initialize the parameters of a recurrent neural network randomly, keep them fixed and...

mt-weekly  en 

Machine Translation Weekly 33: Document-level translation via self-fine-tuning

This week, I will have a look at a recent pre-print that presents an interesting method for document-level machine translation that is quite different from all previous ones. The title of the paper Capturing document context inside sentence-level neural machine translation models with self-training and...

mt-weekly  en 

Machine Translation Weekly 32: BERT in Machine Translation

I am pretty sure everyone has tried to use BERT as a machine translation encoder, and whoever says otherwise is still trying. Representations from BERT brought improvements in most natural language processing tasks, so why would machine translation be an exception? Well, because it is not that easy....

mt-weekly  en 

Machine Translation Weekly 31: Fixing Transformer's Heads

This week, I am going to comment on a paper that appeared on arXiv on Tuesday and raised quite a lot of interest on Twitter. The title of the paper is Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation and it describes work that has...

mt-weekly  en 

Machine Translation Weekly 30: A Multilingual View of Unsupervised Machine Translation

This week, it is the third time in recent weeks that I am reviewing a paper that primarily focuses on unsupervised machine translation. The title of the paper is A Multilingual View on Unsupervised Machine Translation and it again describes work done...

mt-weekly  en 

Machine Translation Weekly 29: Encode, Tag, Realize - sequence transformation by learned edit operations

This week I will have a look at a paper from last year’s EMNLP that introduces a relatively simple architecture for sequence generation when the target sequence is very similar to the source sequence. The title of the paper is “Encode, Tag, Realize: High-Precision Text...

mt-weekly  en 

Machine Translation Weekly 28: mBART – Multilingual Pretraining of Sequence-to-sequence Models

The trend of model pre-training and task-specific fine-tuning finally fully hit machine translation as well. After being used for some time for unsupervised machine translation training, at the end of January Facebook published mBART, a pre-trained sequence-to-sequence model for 25 languages at the same time....

mt-weekly  en 

Machine Translation Weekly 27: Explaining the Reformer

This week I will review a paper that is not primarily about machine translation but about a neural architecture that can make a big impact on machine translation and natural language processing in general. This post is about Google’s Reformer, a neural architecture that is...

mt-weekly  en 

Machine Translation Weekly 26: Unsupervised Machine Translation

One of the hottest topics in machine translation and one of the topics I ignored so far is unsupervised machine translation, i.e., machine translation trained without the use of any parallel data. I will go through a seven-month-old paper published at this year’s ACL titled...

mt-weekly  en 

Machine Translation Weekly 25: Weaknesses of Reinforcement Learning for NMT

Back in 2016, one of the trendy topics was reinforcement learning and other forms of optimizing NMT directly towards some more relevant metrics rather than using cross-entropy of the conditional word distributions. Standard machine translation models are trained to maximize the single-word conditional distributions, which is...

mt-weekly  en 

Machine Translation Weekly 24: Cross-Lingual Ability of Multilingual BERT

After the Christmas holidays, I will once again have a look at multilingual BERT. I already discussed multilingual BERT on this blog once when I reviewed a paper that explored some cross-lingual and multilingual properties of multilingual BERT. This week’s paper does more in-depth experiments...

mt-weekly  en 

Machine Translation Weekly 23: Word Sense Disambiguation in Neural Machine Translation

This week, I would like to give some thoughts about word senses and representation contextualization in machine translation. I will start by explaining why I think the current way of writing about word senses in NLP is kind of misleading and why I think we...

mt-weekly  en 

Machine Translation Weekly 22: Understanding Knowledge Distillation in Non-Autoregressive Machine Translation

Last week, I discussed a paper claiming that forward-translation might be a better data augmentation technique than back-translation. This week, I will follow with a paper that touches a similar topic, but in a slightly different context. The title of the paper is Understanding Knowledge Distillation in Non-Autoregressive Machine Translation and was...

mt-weekly  en 

Machine Translation Weekly 21: On Translationese and Back-Translation

Does WMT speak translationese? And who else speaks translationese? Is the success of back-translation fake news? These are the questions implicitly asked by the authors of a paper called Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation that was uploaded on arXiv earlier...

mt-weekly  en 

Machine Translation Weekly 20: Search and Model Errors in Neural Machine translation

This week, I will have a look at a paper from this year’s EMNLP that got a lot of attention on Twitter this week. If there was an award for the most disturbing machine translation paper, this would be a good candidate. The title of...

mt-weekly  en 

Machine Translation Weekly 19: Domain Robustness

This week, I will briefly have a look at a paper that discusses another major problem of current machine translation: domain robustness. The problem is very well analyzed in a paper from the University of Zurich called Domain Robustness in Neural Machine Translation...

mt-weekly  en 

Machine Translation Weekly 18: BPE Dropout

Everyone who followed natural language processing on Twitter last week must have noticed a paper called BPE-Dropout: Simple and Effective Subword Regularization that introduces a simple way of adding stochastic noise into text segmentation to increase model robustness. It sounds complicated, but it is fairly easy. As...

mt-weekly  en 

Machine Translation Weekly 17: When is Document-Level Context Useful?

One of the biggest limitations of current machine translation systems is that they only work with isolated sentences. The systems need to guess when it comes to phenomena that cross the (often rather arbitrary) sentence boundaries. The typical example that is mentioned everywhere is the translation...

mt-weekly  en 

Machine Translation Weekly 16: Hybrid character-level and word-level machine translation

One of the topics I am currently dealing with in my research is character-level modeling for neural machine translation. Therefore, I was glad to see a paper that appeared on arXiv last week called On the Importance of Word Boundaries in Character-level Neural Machine Translation that shows an interesting...

mt-weekly  en 

Machine Translation Weekly 15: How Multilingual is Multilingual BERT?

This week, I will slightly depart from machine translation and have a look at a paper How Multilingual is Multilingual BERT by Google Research. BERT, the Sesame Street muppet that recently colonized the whole area of natural language processing, is a model trained to predict...

mt-weekly  en 

Machine Translation Weekly 14: Modeling Confidence in Sequence-to-Sequence Models

Neural machine translation is based on machine learning—we collect training data, pairs of parallel sentences which we hope represent how language is used in the two languages, and train models using the data. When the model is trained, the more the input resembles the sentences...

mt-weekly  en 

Machine Translation Weekly 13: Long Distance Dependencies

Let us follow up on the gender paper and have a look at other cases where machine translation does not work as well as we would like it to work. This time, we will have a look at a paper that talks about grammatically complex...

mt-weekly  en 

Machine Translation Weekly 12: Memory-Augmented Networks

Five years ago when deep learning slowly started to be cool, there was a paper called Neural Turing Machines (which are not really Turing machines, but at least they are neural in a narrow technical sense). The paper left me with a foolishly naive impression...

mt-weekly  en 

Machine Translation Weekly 11: Gender and Machine Translation

It’s time to talk about gender—why things go wrong with gender in machine translation and what people do about it. Some languages have gendered nouns (German), some have gendered almost everything (Czech, French) and some gender only a few pronouns (English). Let’s say you want to translate...

mt-weekly  en 

Machine Translation Weekly 10: Massively Multilingual Neural Machine Translation

The holiday period is over and I have almost settled in my new place of operation, which is the Ludwig Maximilian University of Munich, and now there is nothing that can prevent me from continuing with weekly reports on what is new in the world of machine...

mt-weekly  en 

Machine Translation Weekly 9: Shared Task on Machine Translation Robustness

Machine translation is typically trained on bilingual data that can be found on the Internet. It mostly comes from international government and non-government organizations, commercial web presentations, books, movie subtitles, etc. Therefore, most of the text is quite formal and almost without typos and certainly...

mt-weekly  en 

Machine Translation Weekly 8: A Generalized Framework of Sequence Generation

This week’s post contains more math than usual. I will talk about a paper that unifies several decoding algorithms in MT using one simple equation. The paper is called A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models, it comes from New...

mt-weekly  en 

Machine Translation Weekly 7: Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

Remember two years ago when all tech news sites enthusiastically reported that a translator by Google created its own language? These reports were based on a paper that was published in TACL in summer 2017 after its pre-print had been available on arXiv since November 2016. The paper...

mt-weekly  en 

Machine Translation Weekly 6: Probing the Need for Visual Context in Multimodal Machine Translation

This week, we will have a look at a paper that won the best short paper award at NAACL 2019. The name of the paper is Probing the Need for Visual Context in Multimodal Machine Translation and it was written by friends of mine from...

mt-weekly  en 

Machine Translation Weekly 5: Revisiting Low-Resource Neural Machine Translation

Let’s continue with pre-prints of papers which are going to appear at ACL this year and have a look at another paper that comes from the University of Edinburgh, titled Revisiting Low-Resource Neural Machine Translation: A Case Study. This paper is a reaction to an...

mt-weekly  en 

Machine Translation Weekly 4: Analyzing Multi-Head Self-Attention

With the ACL camera-ready deadline slowly approaching, future ACL papers start to pop up on arXiv. One of those which went public just a few days ago is a paper called Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned...

mt-weekly  en 

Machine Translation Weekly 3: Constant-Time Machine Translation with Conditional Masked Language Models

This week, we will have a look at a brand-new method for non-autoregressive machine translation published a few weeks ago on arXiv by Facebook AI Research, two days before the start of the anonymity period for the EMNLP conference. Most models for neural machine translation work autoregressively. When...

mt-weekly  en 

Machine Translation Weekly 2: BERTScore

Last week, a paper appeared on arXiv that introduces a method for MT evaluation using BERT sentence representations. The metric seems to be the new state of the art in MT evaluation. Its name is BERTScore and it comes from Cornell University. MT evaluation...

mt-weekly  en 

Machine Translation Weekly 1: Bidirectional Decoding

This is the first post from a series in which I will try to come up with summaries of some of the latest papers and other news on machine translation. The main goal of this exercise is to force myself to read new papers regularly...

mt-weekly  en 

Tak trochu fake news o umělé inteligenci

English version of the post Když si v médiích čtu o technologiích, které využívají strojové učení, a o umělé inteligenci, jak se teď říká, často se divím, jak jsou zprávy nejen zjednodušující, ale především zavádějící. Určitě se to děje ve všech oborech, nicméně rozhořčování se...

popularization  cs 

Kind of fake news on artificial intelligence

Czech version of the post While reading news stories on research or products involving deep learning, I am often surprised how inaccurate and misleading the news stories are. It is probably a problem of almost all expert fields that happen to appear in the media, luckily they do...

popularization  en 

Nesnesitelná soutěživost umělých inteligentů

English version of the post Odmala jsem si myslel, že biatlon je divný sport. Vrtalo mi hlavou, jak někoho napadlo soutěžit v tak odlišných věcech jako je ježdění na lyžích a střelba. O trochu větší překvapení přišlo, když jsem se dozvěděl o existenci moderního pětiboje. Díky...

popularization  cs 

Further, faster, stronger, dear AI

Czech version of the post Since I was a little boy, I was astonished by how weird a sport biathlon is. I couldn’t imagine how someone could possibly invent a combination of cross-country skiing and shooting. It blew my mind when I found out there is an even weirder combination...

popularization  en 

Computational Linguistics in the 21st century – a private manifesto of its perpetual student

Czech version of the post In this essay, I would like to sum up my opinions on what the role of computational linguistics is, why people should concern themselves with it, what I believe its current problems are, and most importantly why it is a very exciting field...

popularization  en 

Počítačová lingvistika 21. století – soukromý manifest jejího věčného studenta

English version of the post V tomto příspěvku se pokusím zaujatě a angažovaně shrnout, co je to počítačová (chcete-li komputační nebo matematická) lingvistika, jaké jsou její současné problémy a proč je i přesto fascinujícím oborem, kterým má smysl se zabývat. Počítačová lingvistika je cosi mezi...

popularization  cs 

Deep learning a psaní I/Y

English version of the post Z deep learningu (hlubokého učení – strojového učení pomocí neuronových sítí) se v posledních letech stal buzzword technologického světa. Můžeme číst články, co dovede umělá inteligence (vyčpělé artificial intelligence se s oblibou nahrazuje pojmem machine intelligence) – jak dovede vyřešit automatický překlad,...

popularization  cs 

Spell checking of i and y in Czech using deep learning

Czech version of the post In recent years, deep learning (machine learning with neural networks) became a frequently used buzzword in the technological world. We can find plenty of articles on how machine intelligence (a new, probably sexier term for artificial intelligence) can solve machine translation,...

popularization  en 

What is Neural Machine Translation Capable of?

Czech version of the post A year ago at EMNLP in Lisbon, I saw a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and almost apologized to everybody who passed by that the...

popularization  en 

Co dovede neuronový překlad?

English version of the post Před rokem jsem na konferenci EMNLP v Lisabonu zahlédl článek, který se jmenoval On Statistical Machine Translation and Translation Theory (O statistickém strojovém překladu a teorii překladu) od Christiana Hardmeiera. Stál před svým posterem a každému, kdo se u jeho posteru...

popularization  cs 

subscribe via RSS