The ultimate goal of AI is a world taking care of itself,
so that humans can finally become a pack of carefree, barefooted,
long-haired hippies. (← irony)
Sunday, February 21, 2021
This week I will discuss a paper about the one-shot vocabulary learning abilities of machine translation models. The paper, titled Continuous Learning in Neural Machine Translation using Bilingual Dictionaries, will be presented at EACL in May this year. A very similar idea...
mt-weekly
en
⏰
2.5
min
Sunday, February 14, 2021
Today, I am going to comment on a paper that systematically explores something that many MT users probably do: pre-editing (editing the source sentence) to get better output from an MT system that is treated as a black box. The title of the...
mt-weekly
en
⏰
1.8
min
Sunday, February 07, 2021
If someone had told me ten years ago, when I was a freshly graduated bachelor of computer science, that there would be models producing multilingual sentence representations that allow zero-shot model transfer, I would hardly have believed such a prediction. If they had added that the models...
mt-weekly
en
⏰
3.0
min
Sunday, January 24, 2021
This week I am going to revisit the mystery of decoding in neural machine translation one more time. It has been more than a year since Felix Stahlberg and Bill Byrne discovered a very disturbing feature of neural machine translation models – that...
mt-weekly
en
⏰
2.0
min
Sunday, January 17, 2021
Today, I am going to talk about a recent pre-print on sequence-to-sequence models for deciphering substitution ciphers. Doing such a thing had been somewhere at the bottom of my to-do list for a few years; I suggested it as a thesis topic to several master's students...
mt-weekly
en
⏰
2.0
min
Friday, January 08, 2021
Half a year ago I featured here (MT Weekly 45) a paper that questions the contribution of non-autoregressive models to computational efficiency. It showed that a model with a deep encoder (that can be parallelized) and a shallow decoder (that works sequentially) reaches the same...
mt-weekly
en
⏰
3.7
min
Sunday, December 20, 2020
This week I will have a look at the best paper from this year’s COLING that brings an interesting view on inference in NMT models. The title of the paper is “Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine...
mt-weekly
en
⏰
2.5
min
Saturday, December 12, 2020
Papers about new models for sequence-to-sequence modeling have always been my favorite genre. This week I will talk about a model called EDITOR that was introduced in a pre-print of a paper that will appear in the TACL journal, by authors from the University of...
mt-weekly
en
⏰
2.8
min
Saturday, December 05, 2020
This week I will comment on a short paper from Carnegie Mellon University and Amazon that shows a simple analysis of the diversity of machine translation outputs. The title of the paper is Decoding and Diversity in Machine Translation and it will be presented at...
mt-weekly
en
⏰
1.9
min
Sunday, November 29, 2020
This week, I will follow up on last week’s post and comment on the news from this year’s WMT, which was co-located with EMNLP. As every year, there were many shared tasks on various types of translation and on the evaluation of machine translation. News translation task...
mt-weekly
en
⏰
2.3
min
Saturday, November 21, 2020
Another large NLP conference that had to take place in a virtual environment, EMNLP 2020, is over, and here are my notes from the conference. The ACL in the summer had most Q&A sessions on Zoom, which meant most of the authors waiting forever...
mt-weekly
en
⏰
7.1
min
Sunday, November 08, 2020
Today, I am going to talk about a topic that is rather unknown to me: the safety and vulnerability of machine translation. I will comment on a paper Targeted Poisoning Attacks on Black-Box Neural Machine Translation by authors from the University of Melbourne and Facebook...
mt-weekly
en
⏰
2.5
min
Sunday, November 01, 2020
This week, I am going to discuss the paper “Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation” by authors from Alibaba Group. The preprint of the paper appeared a month ago on arXiv and will be presented at this...
mt-weekly
en
⏰
2.4
min
Sunday, October 25, 2020
Last year, an EMNLP paper “On NMT Search Errors and Model Errors: Cat Got Your Tongue?” (which I discussed in MT Weekly 20) showed a mind-blowing property of neural machine translation models: the most probable target sentence is not necessarily the best target sentence....
mt-weekly
en
⏰
3.0
min
Saturday, October 17, 2020
This week, I am going to have a closer look at a paper that creatively uses methods for bilingual word embeddings for social media analysis. The paper’s preprint was uploaded last week on arXiv. The title is “We Don’t Speak the Same Language: Interpreting Polarization...
mt-weekly
en
⏰
2.1
min
Saturday, October 10, 2020
This article originally appeared in last year’s December issue of the journal Rozhledy matematicko-fyzikální. What is machine translation? When people hear machine translation, most probably picture Google Translate, and most have probably also tried out how it works. Those who use the translator more often may have noticed that roughly...
cs
popularization
⏰
10.9
min
This week, I will discuss Nearest Neighbor Machine Translation, a paper from this year’s ICML that takes advantage of the overlooked representation learning capabilities of machine translation models. This paper’s idea is pretty simple and is basically the same as in the previous work on nearest...
mt-weekly
en
⏰
1.8
min
Friday, October 02, 2020
After a short break, MT weekly is again here, and today I will talk about a paper “CSP: Code-Switching Pre-training for Neural Machine Translation” that will appear at this year’s virtual EMNLP. The paper proposes a new and surprisingly elegant way of monolingual pre-training for...
mt-weekly
en
⏰
2.1
min
Friday, September 11, 2020
This week I am going to have a look at a paper by my former colleagues from Prague, “Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals”, which was published in Nature Communications. The paper systematically studies machine translation...
mt-weekly
en
⏰
2.6
min
Thursday, September 03, 2020
Over the few years during which neural models have been the state of the art in machine translation, the architectures have become quite standardized. There is a vocabulary of several thousand discrete input/output units. As the first step, the inputs are represented by static embeddings which get encoded...
mt-weekly
en
⏰
2.7
min
Sunday, August 30, 2020
Pre-trained multilingual representations promise to make the current best NLP models available even for low-resource languages. With a truly language-neutral pre-trained multilingual representation, we could train a task-specific model for English (or another language with available training data) and such a model would work for...
mt-weekly
en
⏰
3.4
min
Friday, August 21, 2020
It is a well-known fact that when you have a hammer, everything looks like a nail. It is a less-known fact that when you have a sequence-to-sequence model, everything looks like machine translation. One example of this thinking is the paper Paraphrase Generation as Zero-Shot...
mt-weekly
en
⏰
2.2
min
Saturday, August 15, 2020
This week, I will comment on a recent pre-print by Facebook AI titled Pre-training via Paraphrasing. The paper introduces a model called MARGE (indeed, they want to say it belongs to the same family as BART by Facebook) that uses a clever way of denoising...
mt-weekly
en
⏰
3.2
min
Friday, July 10, 2020
In this extremely long post, I will not focus on one paper as I usually do, but will instead show my brief, but still infinitely long, notes from this year’s ACL. Many people have already commented on the virtual format of the conference. I will spare...
mt-weekly
en
⏰
9.5
min
Friday, July 03, 2020
Back in 2013, a friend of mine enthusiastically told me how excited he was about deep learning democratizing AI (while saying, by the way, that it was not relevant for NLP at all): there was no need for large CPU clusters; all you needed was to buy a gaming...
mt-weekly
en
⏰
3.7
min
Friday, June 26, 2020
Researchers concerned with machine translation speed have invented several methods that are supposed to significantly speed up translation while maintaining as much as possible of the translation quality of the state-of-the-art models. The methods are usually based on generating as many words as possible in...
mt-weekly
en
⏰
2.7
min
Friday, June 19, 2020
For quite a while, machine translation has been approached as a behaviorist simulation. Don’t you know what a good translation is? It does not matter, you can just simulate what humans do. Don’t you know how to measure whether something is a good translation? It does...
mt-weekly
en
⏰
2.9
min
Friday, June 12, 2020
One of the narratives people (including me) love to associate with neural machine translation is that we got rid of all linguistic assumptions about the text and let the neural network learn its own way, independent of what people think about language. It sounds cool,...
mt-weekly
en
⏰
2.9
min
Saturday, June 06, 2020
Several weeks ago, I discussed a paper that showed how parallel data between two languages can be used to improve unsupervised translation between one of the two languages and a third one. This week, I will have a look at a similar idea applied in...
mt-weekly
en
⏰
2.7
min
Saturday, May 09, 2020
Recently, I came across a paper that announces a dataset release. The dataset is called PuzzLing and collects translation puzzles from the International Linguistics Olympiad. In machine translation jargon, I would say it provides extremely small training data for learning to translate unknown languages....
mt-weekly
en
⏰
2.9
min
Saturday, May 02, 2020
More than half a year ago (in MT Weekly 10), I discussed massively multilingual models by Google. They managed to train two large models: one translating from 102 languages into English, the other one from English into 102 languages. This approach seemed to help a...
mt-weekly
en
⏰
1.9
min
Tuesday, April 28, 2020
English version of the post Lately on this blog, I have mostly been commenting on research papers about machine translation, which are rarely longer than ten pages. This time, I will make an exception and write about a book that is several hundred pages long. It deals with some important societal issues that...
cs
⏰
8.5
min
Czech version of the post On this blog, I usually review papers that are around 10 pages long. This time, I am going to write about a book that is several hundred pages long and discusses important issues that I believe people dealing with AI should...
en
⏰
9.6
min
Saturday, April 25, 2020
Before the Transformer architecture was invented, recurrent networks were the most prominent architectures used in machine translation and the rest of natural language processing. It is quite surprising how little we still know about the architectures from the theoretical perspective. People often repeat a claim...
mt-weekly
en
⏰
2.9
min
Saturday, April 18, 2020
In the past week, there were quite a lot of papers on machine translation on arXiv, at least a few of them every day. Let me have a look at one that tackles an important topic – machine translation evaluation – from a quite unusual...
mt-weekly
en
⏰
3.0
min
Friday, April 10, 2020
It is sometimes fascinating to observe how each step of training neural machine translation systems gets one by one picked up by the research community, analyzed to the tiniest detail and turned into a complex recipe. Data augmentation by back-translation used to be a pretty...
mt-weekly
en
⏰
2.1
min
Saturday, April 04, 2020
This week, I am going to have a look at a topic that I had not thought about much before: sign language translation. I will comment on a bachelor’s thesis from Ecole Polytechnique in Paris that was uploaded to arXiv earlier this week. It was...
mt-weekly
en
⏰
2.6
min
Friday, March 27, 2020
I always rationalized the encoder-decoder architecture as a conditional language model. The decoder is the language model, the component that “knows” the target language (whatever it means) and uses the encoder as an external memory, so it does not forget what it was talking about....
mt-weekly
en
⏰
2.2
min
Saturday, March 21, 2020
This week I am going to write a few notes on paper Echo State Neural Machine Translation by Google Research from some weeks ago. Echo state networks are a rather weird idea: initialize the parameters of a recurrent neural network randomly, keep them fixed and...
mt-weekly
en
⏰
1.6
min
Friday, March 13, 2020
This week, I will have a look at a recent pre-print that presents an interesting method for document-level machine translation, one quite different from all previous ones. The title of the paper is Capturing document context inside sentence-level neural machine translation models with self-training and...
mt-weekly
en
⏰
2.0
min
Thursday, March 05, 2020
I am pretty sure everyone has tried to use BERT as a machine translation encoder – and whoever says otherwise keeps trying. Representations from BERT brought improvement in most natural language processing tasks, so why would machine translation be an exception? Well, because it is not that easy....
mt-weekly
en
⏰
2.5
min
Friday, February 28, 2020
This week, I am going to comment on a paper that appeared on arXiv on Tuesday and raised quite a lot of interest on Twitter. The title of the paper is Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation and it describes work that has...
mt-weekly
en
⏰
4.1
min
Friday, February 21, 2020
This week, it is the third time in recent weeks that I am reviewing a paper that primarily focuses on unsupervised machine translation. The title of the paper is A Multilingual View on Unsupervised Machine Translation and it once again describes work done...
mt-weekly
en
⏰
5.7
min
Wednesday, February 12, 2020
This week I will have a look at a paper from last year’s EMNLP that introduces a relatively simple architecture for sequence generation when the target sequence is very similar to the source sequence. The title of the paper is “Encode, Tag, Realize: High-Precision Text...
mt-weekly
en
⏰
4.0
min
Friday, February 07, 2020
The trend of model pre-training and task-specific fine-tuning has finally fully hit machine translation as well. After being used for some time for unsupervised machine translation training, at the end of January Facebook published a pre-trained sequence-to-sequence model for 25 languages at the same time....
mt-weekly
en
⏰
4.0
min
Friday, January 31, 2020
This week I will review a paper that is not primarily about machine translation but about a neural architecture that can nevertheless make a big impact on machine translation and natural language processing in general. This post is about Google’s Reformer, a neural architecture that is...
mt-weekly
en
⏰
8.9
min
Thursday, January 23, 2020
One of the hottest topics in machine translation and one of the topics I ignored so far is unsupervised machine translation, i.e., machine translation trained without the use of any parallel data. I will go through a seven-month-old paper published at this year’s ACL titled...
mt-weekly
en
⏰
6.5
min
Thursday, January 16, 2020
Back in 2016, one of the trendy topics was reinforcement learning and other forms of optimizing NMT directly towards more relevant metrics rather than the cross-entropy of the conditional word distributions. Standard machine translation models are trained to maximize the single-word conditional distributions, which is...
mt-weekly
en
⏰
2.8
min
Thursday, January 09, 2020
After the Christmas holidays, I will once again have a look at multilingual BERT. I already discussed multilingual BERT on this blog once when I reviewed a paper that explored some cross-lingual and multilingual properties of multilingual BERT. This week’s paper does more in-depth experiments...
mt-weekly
en
⏰
3.3
min
Thursday, December 12, 2019
This week, I would like to give some thoughts about word senses and representation contextualization in machine translation. I will start by explaining why I think the current way of writing about word senses in NLP is kind of misleading and why I think we...
mt-weekly
en
⏰
4.3
min
Friday, December 06, 2019
Last week, I discussed a paper claiming that forward-translation might be a better data augmentation technique than back-translation. This week, I will follow up with a paper that touches on a similar topic, but in a slightly different context. The title of the paper is Understanding Knowledge Distillation in Non-Autoregressive Machine Translation, and it was...
mt-weekly
en
⏰
3.8
min
Thursday, November 28, 2019
Does WMT speak translationese? And who else speaks translationese? Is the success of back-translation fake news? These are the questions implicitly asked by the authors of a paper called Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation, which was uploaded to arXiv earlier...
mt-weekly
en
⏰
3.4
min
Thursday, November 21, 2019
This week, I will have a look at a paper from this year’s EMNLP that got a lot of attention on Twitter this week. If there was an award for the most disturbing machine translation paper, this would be a good candidate. The title of...
mt-weekly
en
⏰
4.1
min
Thursday, November 14, 2019
This week, I will briefly have a look at a paper that discusses another major problem of current machine translation: domain robustness. The problem is very well analyzed in a paper from the University of Zurich called Domain Robustness in Neural Machine Translation...
mt-weekly
en
⏰
2.4
min
Thursday, November 07, 2019
Everyone who followed natural language processing on Twitter last week must have noticed a paper called BPE-Dropout: Simple and Effective Subword Regularization that introduces a simple way of adding stochastic noise into text segmentation to increase model robustness. It sounds complicated, but it is fairly easy. As...
mt-weekly
en
⏰
2.7
min
Thursday, October 31, 2019
One of the biggest limitations of current machine translation systems is that they only work with isolated sentences. The systems need to guess when it comes to phenomena that cross the (often rather arbitrary) sentence boundaries. The typical example that is mentioned everywhere is the translation...
mt-weekly
en
⏰
4.0
min
Wednesday, October 23, 2019
One of the topics I am currently dealing with in my research is character-level modeling for neural machine translation. Therefore, I was glad to see a paper that appeared on arXiv last week called On the Importance of Word Boundaries in Character-level Neural Machine Translation that shows an interesting...
mt-weekly
en
⏰
5.3
min
Friday, October 18, 2019
This week, I will slightly depart from machine translation and have a look at the paper How Multilingual is Multilingual BERT by Google Research. BERT, the Sesame Street muppet that recently colonized the whole area of natural language processing, is a model trained to predict...
mt-weekly
en
⏰
4.0
min
Thursday, October 10, 2019
Neural machine translation is based on machine learning—we collect training data, pairs of parallel sentences which we hope represent how language is used in the two languages, and train models using the data. When the model is trained, the more the input resembles the sentences...
mt-weekly
en
⏰
2.8
min
Wednesday, October 02, 2019
Let us follow up on the gender paper and have a look at other cases where machine translation does not work as well as we would like it to work. This time, we will have a look at a paper that talks about grammatically complex...
mt-weekly
en
⏰
3.4
min
Thursday, September 26, 2019
Five years ago, when deep learning slowly started to be cool, there was a paper called Neural Turing Machines (which are not really Turing machines, but at least they are neural in a narrow technical sense). The paper left me with a foolishly naive impression...
mt-weekly
en
⏰
3.0
min
Wednesday, September 18, 2019
It’s time to talk about gender—why things go wrong with gender in machine translation and what people do about it. Some languages have gendered nouns (German), some have gendered almost everything (Czech, French) and some only a few pronouns (English). Let’s say you want to translate...
mt-weekly
en
⏰
2.9
min
Wednesday, September 11, 2019
The holiday period is over and I have almost settled in at my new place of operation, the Ludwig-Maximilian University of Munich, and now there is nothing that can prevent me from continuing with weekly reports on what is new in the world of machine...
mt-weekly
en
⏰
2.7
min
Wednesday, July 10, 2019
Machine translation is typically trained on bilingual data that can be found on the Internet. It mostly comes from international governmental and non-governmental organizations, commercial websites, books, movie subtitles, etc. Therefore, most of the text is quite formal, almost without typos, and certainly...
mt-weekly
en
⏰
3.0
min
Thursday, July 04, 2019
This week’s post contains more math than usual. I will talk about a paper that unifies several decoding algorithms in MT using one simple equation. The paper is called A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models, and it comes from New...
mt-weekly
en
⏰
4.8
min
Monday, June 24, 2019
Remember two years ago, when all the tech news sites enthusiastically reported that a translator by Google had created its own language? These reports were based on a paper that was published in TACL in summer 2017, after its pre-print had been available on arXiv since November 2016. The paper...
mt-weekly
en
⏰
3.6
min
Tuesday, June 11, 2019
This week, we will have a look at a paper that won the best short paper award at NAACL 2019. The name of the paper is Probing the Need for Visual Context in Multimodal Machine Translation and it was written by friends of mine from...
mt-weekly
en
⏰
2.5
min
Tuesday, June 04, 2019
Let’s continue with pre-prints of papers which are going to appear at ACL this year and have a look at another paper that comes from the University of Edinburgh, titled Revisiting Low-Resource Neural Machine Translation: A Case Study. This paper is a reaction to an...
mt-weekly
en
⏰
2.2
min
Monday, May 27, 2019
With the ACL camera-ready deadline slowly approaching, future ACL papers start to pop up on arXiv. One of those which went public just a few days ago is a paper called Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned...
mt-weekly
en
⏰
4.2
min
Friday, May 17, 2019
This week, we will have a look at a brand-new method for non-autoregressive machine translation published a few weeks ago on arXiv by Facebook AI Research, two days before the start of the anonymity period for the EMNLP conference. Most models for neural machine translation work autoregressively. When...
mt-weekly
en
⏰
3.0
min
Wednesday, May 01, 2019
Last week, there was a paper on arXiv that introduces a method for MT evaluation using BERT sentence representations. The metric seems to be the new state of the art in MT evaluation. Its name is BERTScore and it was developed at Cornell University. MT evaluation...
mt-weekly
en
⏰
2.9
min
Tuesday, April 23, 2019
This is the first post from a series in which I will try to come up with summaries of some of the latest papers and other news on machine translation. The main goal of this exercise is to force myself to read new papers regularly...
mt-weekly
en
⏰
3.1
min
Tuesday, November 21, 2017
English version of the post When I read in the media about technologies that use machine learning – and about artificial intelligence, as it is now called – I am often surprised at how the news reports are not only oversimplified but, above all, misleading. It surely happens in all fields, but getting outraged...
popularization
cs
⏰
5.4
min
Czech version of the post While reading news stories on research or products involving deep learning, I am often surprised at how inaccurate and misleading the news stories are. It is probably a problem of almost all expert fields that happen to appear in the media; luckily, they do...
popularization
en
⏰
5.9
min
Monday, May 29, 2017
English version of the post Since I was little, I have thought that biathlon is a weird sport. It puzzled me how anyone came up with the idea of competing in things as different as skiing and shooting. A slightly bigger surprise came when I learned of the existence of the modern pentathlon. Thanks to...
popularization
cs
⏰
9.1
min
Czech version of the post Since I was a little boy, I was astonished by how weird a sport biathlon is. I couldn’t imagine how someone could possibly invent a combination of cross-country skiing and shooting. It blew my mind when I found out there is an even weirder combination...
popularization
en
⏰
6.8
min
Monday, March 20, 2017
Czech version of the post In this essay, I would like to sum up my opinions on what the role of computational linguistics is, why people should concern themselves with it, what I believe its current problems are, and most importantly why it is a very exciting field...
popularization
en
⏰
8.2
min
English version of the post In this post, I will try to give a biased and engaged summary of what computational (or, if you prefer, mathematical) linguistics is, what its current problems are, and why it is nevertheless a fascinating field worth pursuing. Computational linguistics is something between...
popularization
cs
⏰
7.3
min
Wednesday, February 22, 2017
English version of the post In recent years, deep learning (machine learning with neural networks) has become a buzzword of the technological world. We can read articles about what artificial intelligence can do (the worn-out term artificial intelligence is often replaced with machine intelligence) – how it can solve machine translation,...
popularization
cs
⏰
10.1
min
Czech version of the post In recent years, deep learning (machine learning with neural networks) became a frequently used buzzword in the technological world. We can find plenty of articles on how machine intelligence (a new, probably sexier term for artificial intelligence) can solve machine translation,...
popularization
en
⏰
9.9
min
Tuesday, November 29, 2016
Czech version of the post A year ago at EMNLP in Lisbon, I saw a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and almost apologized to everybody who passed by that the...
popularization
en
⏰
4.7
min
English version of the post A year ago at the EMNLP conference in Lisbon, I spotted a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and to everyone who passed by his poster...
popularization
cs
⏰
4.2
min