The ultimate goal of AI is a world taking care of itself,
so the humans can finally become a pack of carefree, barefooted,
long-haired hippies. (← irony)
Tuesday, July 23, 2024
This post is a retrospective on two studies on multilingual sentence embeddings that we published a year ago, with comments on what I think people analyzing LLMs today should take away from them. In late 2022, we (mainly thanks to the work of Kathy Hämmerl from...
en
⏰
4.0
min
Wednesday, June 05, 2024
Here are short summaries of three pre-prints that I enjoyed reading in May. Zero-Shot Tokenizer Transfer Folks from the University of Cambridge and the University of Edinburgh propose a nice trick for changing the vocabulary of an already trained language model. They train a hyper-network...
mtml-highlights
en
⏰
2.0
min
Sunday, May 05, 2024
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation Folks from the University of the Basque Country prepared an English-Spanish dataset for natural language inference (i.e., deciding if sentences follow from each other, are in contradiction, or have nothing to do with each other)...
mtml-highlights
en
⏰
2.2
min
Monday, April 08, 2024
Did Translation Models Get More Robust Without Anyone Even Noticing? Folks from Lisbon study how robust the newest MT systems are against source-side noise. Machine translation using large models, including translation-specific NLLB or via LLMs (such as Tower or GPT-3.5), is much more robust both...
mtml-highlights
en
⏰
1.6
min
Wednesday, March 06, 2024
With a new month, here are a few papers that I noticed on arXiv in February. Linear-time Minimum Bayes Risk Decoding with Reference Aggregation A preprint from the University of Zurich proposes a linear time version of Minimum Bayes Risk (MBR) decoding in machine translation....
mtml-highlights
en
⏰
2.0
min
Tuesday, February 06, 2024
Many things happened in the field in December: EMNLP, Google released Gemini, and Mixtral appeared. January was seemingly not that packed with new events, but plenty of new interesting work popped up on arXiv. Predicting Human Translation Difficulty with Neural Machine Translation Folks from the...
mtml-highlights
en
⏰
2.7
min
Tuesday, December 05, 2023
Here are a couple of articles that caught my attention in November. Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles A team from Johns Hopkins University published a pre-print that belongs to the currently trendy genre: stuff we can do with...
mtml-highlights
en
⏰
2.4
min
Friday, November 03, 2023
Here is my monthly summary of what papers on multilinguality and machine translation I found the most noteworthy during October 2023. There were 2,881 preprints in the computation and language category on arXiv (a new record number), so there is a big chance that there...
mtml-highlights
en
⏰
3.3
min
Wednesday, October 11, 2023
Here are short summaries of the papers I liked the most during the (academic) summer. Also, this time, I am posting both on GitHub pages and on Medium. mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs The preprint from the University of Würzburg presents a recipe for...
mtml-highlights
en
⏰
2.4
min
Saturday, July 08, 2023
Here are the preprints that I found the most interesting in June 2023. Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers Folks from LORIA (a French research institute) and Posos (a French company) study the relationship between cross-lingual representation alignment and cross-lingual...
mtml-highlights
en
⏰
3.1
min
Friday, June 30, 2023
Staying up to date with the newest NLP work is a tough job, and reading about new research takes a significant amount of my time. For several years, one of my work routines has been skimming over the arXiv digest. I open a few preprints,...
en
automated-academic
⏰
4.9
min
Thursday, June 08, 2023
Here are a few papers I found most interesting in the flood of new pre-prints on arXiv. There was ACL’s camera-ready deadline and the start of the EMNLP anonymity period, so there were many more papers than usual. What is the best recipe for character-level...
mtml-highlights
en
⏰
2.9
min
Wednesday, May 03, 2023
Here is my monthly summary of the new papers and preprints I liked the most during the previous month. Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis Several institutions in China did a thorough evaluation of how large language models work for...
mtml-highlights
en
⏰
2.9
min
Thursday, April 13, 2023
As natural language processing (NLP) finds its way out of university labs and becomes a crucial element of many user-facing technologies (machine translation, search, language-model-based assistants), people are starting to get concerned about the ethics of this technology. When people talk about NLP ethics, the main topics...
en
⏰
6.4
min
Wednesday, April 05, 2023
Here is what I found the most interesting in MT and multilinguality in March. I only feature two papers (both from Microsoft, a coincidence), not because there were too few on arXiv, but because I did not manage to read that much this month. DiTTO: A...
mtml-highlights
en
⏰
2.0
min
Thursday, March 02, 2023
There were plenty of interesting pre-prints on arXiv in February. Here is a brief summary of three that I think are cool but could get lost in the hundreds of papers that went public. The unreasonable effectiveness of few-shot learning for machine translation Folks from...
mtml-highlights
en
⏰
2.1
min
Monday, February 20, 2023
Czech version of the post. There’s been a lot of media coverage of ChatGPT and language models lately, and I feel like not everything is being said quite right. That’s why I have prepared some questions and answers that hopefully help clarify what they are talking about....
popularization
en
⏰
10.4
min
Tuesday, February 07, 2023
English version of the post. Lately, the media have been writing about ChatGPT and language models quite often, and I feel that not everything is being said entirely correctly. That is why I have prepared a few questions and answers that will hopefully help clarify what is actually being talked about....
popularization
cs
⏰
9.3
min
Monday, February 06, 2023
Here is what I found interesting on arXiv in December 2022 and January 2023. At the beginning of January, there were relatively few new pre-prints in general. But now it is catching momentum again, with more papers appearing every day. BLOOM+1: Adding Language Support to...
mtml-highlights
en
⏰
3.1
min
Thursday, January 19, 2023
In this post, I comment on our (i.e., myself, Helmut Schmid and Alex Fraser) year-old paper “Why don’t people use character-level machine translation,” published in Findings of ACL 2022. Here, I will (besides briefly summarizing the paper’s main message) mostly comment on what I learned...
en
⏰
4.0
min
Wednesday, December 21, 2022
Last week I was at EMNLP in Abu Dhabi. Besides losing my passport and figuring out what to do on such an occasion (many thanks to the personnel of the Czech embassy in Abu Dhabi), I had plenty of interesting conversations and saw many interesting...
en
⏰
4.6
min
Friday, December 02, 2022
Here are my monthly highlights from papers on machine translation and multilinguality that appeared on arXiv in November 2022. A preprint with 19 authors from 13 institutions presents something like the T0 model: but instead of starting with the (more or less) monolingual T5 model, they...
mtml-highlights
en
⏰
2.1
min
Sunday, November 06, 2022
Here are my monthly highlights from papers on machine translation and multilinguality that appeared on arXiv, many of them preprints from the upcoming EMNLP conference. Folks from Amazon published a pre-print that introduces a simple method of how to make pre-trained multilingual representation more robust towards...
mtml-highlights
en
⏰
2.6
min
Tuesday, October 04, 2022
Here are my monthly highlights from papers on machine translation and multilinguality. A preprint from the Nara Institute of Science and Technology shows that target-language-specific fully connected layers in the Transformer decoder improve multilingual and zero-shot MT compared to the current practice of using a special...
mtml-highlights
en
⏰
2.4
min
Tuesday, September 06, 2022
There were not many papers I made notes about in August (likely because I was on vacation most of it). Anyway, here are three papers that I think should not be forgotten just because they went out in August. A paper by folks from JHU,...
mtml-highlights
en
⏰
1.2
min
Wednesday, August 03, 2022
Here is my monthly summary of what I found worth reading on arXiv in the past month. A preprint from JHU studies zero-shot cross-lingual transfer using pretrained multilingual representation and comes to the conclusion that it is an under-specified optimization problem. In other words, with...
mtml-highlights
en
⏰
2.1
min
Thursday, July 07, 2022
After a while, here is a dump of what I found most interesting on arXiv about machine translation and multilinguality, covering May and June of this year. Google Research published a pre-print of their NAACL paper: SCONES (Single-label Contrastive Objective for Non-Exclusive Sequences). The paper...
mtml-highlights
en
⏰
1.4
min
Thursday, June 02, 2022
Here are some of my notes and comments on what I had a chance to see at ACL in Dublin last week (my first in-person conference since 2019). ACL D&I 60-60 initiative: ACL announced its 60-60 initiative; for the 60th birthday of ACL, all materials...
en
⏰
3.5
min
Wednesday, May 04, 2022
Another month is over, so here is my overview of what I found most interesting in machine translation and multilinguality. Rotation ciphers as regularizers A paper accepted to ACL 2022 from Simon Fraser University experiments with using rotation ciphers on the source side of MT...
mtml-highlights
en
⏰
2.3
min
Monday, April 04, 2022
Here is a monthly summary of what I found most interesting on arXiv this month from machine translation and multilinguality. This month was the camera-ready deadline for ACL 2022, so many of the interesting papers are accepted to ACL. Overlapping BPE When training, BPE merges...
mtml-highlights
en
⏰
2.1
min
Friday, March 04, 2022
After 100 MT Weekly posts (which took me 130 weeks to write), I realized that weekly blogging is impossible alongside weekly teaching. So I decided to change the format of the post and write monthly summaries of what I found most interesting in machine translation...
mtml-highlights
en
⏰
2.3
min
Sunday, January 30, 2022
This week I would like to feature a new multimodal-multilingual benchmark called IGLUE, presented in a pre-print that went out last Friday. The authors are from many places around the world: University of Copenhagen, Mila – Quebec Artificial Intelligence Institute, University of Cambridge, TU Darmstadt,...
mt-weekly
en
⏰
2.1
min
Monday, January 24, 2022
Multilingual language models and the technologies built on top of them help, to a major extent, make tools accessible that until recently were available only to speakers of major languages in the richer part of the planet. They make it possible (to some degree) to represent text in different languages in a unified way. Machine learning models trained in one language...
mt-weekly
en
⏰
4.8
min
Thursday, January 20, 2022
In a report published in December on arXiv, Google DeepMind tries to categorize major ethical and societal issues connected to large language models. The report probably does not say anything that was not known before, but I like the way they categorize the issues they...
mt-weekly
en
⏰
4.0
min
Sunday, January 09, 2022
By the end of the year, Meta AI (previously Facebook AI) published a pre-print introducing a multilingual version of GPT-3 called XGLM. As its title – Few-shot Learning with Multilingual Language Models – suggests, it explores the few-shot learning capabilities. The main takeaways are: Good...
mt-weekly
en
⏰
2.2
min
Sunday, December 19, 2021
Multilingual machine translation models look very promising, especially for low-resource languages that can benefit from similar patterns in similar languages. A new preprint with authors from the University of Maryland and Google Research studies how these results transfer to non-autoregressive machine translation models. The title...
mt-weekly
en
⏰
1.7
min
Saturday, December 11, 2021
I often review papers on non-autoregressive machine translation and tend to repeat the same things in my reviews. The papers often compare non-comparable things to show the non-autoregressive models in a better light. Apart from the usual flaws in MT evaluation, non-autoregressive papers often (with...
mt-weekly
en
⏰
2.5
min
Friday, December 03, 2021
This week I am returning to a topic that I follow with fascination (cf. MT Weekly #20, #61, #63, and #66) without actually doing any research myself – decoding in machine learning models. The preprint I will discuss today comes from Google Research and has...
mt-weekly
en
⏰
2.6
min
Tuesday, November 23, 2021
After the notes from EMNLP 2021, here is also an unsorted list of some observations from the Conference on Machine Translation. Facebook AI won in many translation directions (though not in all of them) in the news task with a multilingual system. At the...
mt-weekly
en
⏰
1.6
min
Wednesday, November 17, 2021
Another big NLP conference is over and here are my notes about the paper that I liked the most. My general impression was sort of similar to what I got from ACL this year. It seems to me that the field is progressing towards some...
mt-weekly
en
⏰
4.5
min
Monday, November 01, 2021
Deep learning models are prone to so-called catastrophic forgetting when finetuned on slightly different data than they were originally trained on. Often, they also badly generalize when confronted with data that do not exactly look like those they were trained on. On the other hand,...
mt-weekly
en
⏰
1.7
min
Monday, October 25, 2021
How many times have you heard someone saying that multilingual BERT or similar models could be used as a universal encoder in machine translation? I heard that (and said that) many times, but never heard about someone who actually did that, until now. Folks from...
mt-weekly
en
⏰
2.0
min
Monday, October 18, 2021
This week, I am going to share my amazement and doubts about what could be called the surprising multilinguality of large language models. By large language models, I mean the really large ones that I can hardly run myself, trained on huge, hardly curated data...
mt-weekly
en
⏰
3.1
min
Sunday, October 10, 2021
Similar to last week, I will discuss a paper about input segmentation. The paper is not directly about machine translation or multilinguality but brings interesting insights for Transformer models in general. The title of the paper is How BPE affects memorization in Transformers, it has...
mt-weekly
en
⏰
1.7
min
Sunday, October 03, 2021
With the semester start, it is also time to renew MT Weekly. My new year’s resolution was to make it to 100 issues, so let’s see if I can keep it. Today, I will talk about a paper by my colleagues from LMU Munich that...
mt-weekly
en
⏰
2.7
min
Friday, August 13, 2021
The story of the science fiction novel Roadside Picnic by Arkady and Boris Strugatsky (mostly known via Tarkovsky’s 1979 film Stalker) takes place after an extraterrestrial event called the Visitation. Some aliens stopped by, made a roadside picnic, and left behind plenty of weird and...
mt-weekly
en
⏰
2.8
min
Saturday, July 24, 2021
Most of the papers that I comment on and review here present novel and cool ideas on how to improve something in machine translation or multilingual NLP. On the other hand, the WMT submissions are different. People want to get the best translation quality and...
mt-weekly
en
⏰
2.5
min
Sunday, July 11, 2021
This week, I will comment on a paper that quantifies and exactly measures the dimensions of the elephant in the room of machine translation: the lack of empirical support of claims in research papers on machine translation. The title of the paper is Scientific Credibility...
mt-weekly
en
⏰
2.3
min
Sunday, June 20, 2021
I tend to be a little biased against autoregressive models. The way they operate: say exactly one subword, think for a while, and then say again exactly one subword, just does not sound natural to me. Moreover, with current models, a subword can be anything...
mt-weekly
en
⏰
1.8
min
Monday, June 14, 2021
This week I will comment on two papers on zero-shot cross-lingual model transfer which do not focus on the representation quality but on the transfer itself. The first one is titled Language Embeddings for Typology and Cross-lingual Transfer Learning and has authors from...
mt-weekly
en
⏰
2.4
min
Monday, June 07, 2021
This week I am going to discuss (and criticize) a paper on multimodal machine translation that attempts to once again evaluate if and how the visual information could be useful in machine translation. The title of the paper is Good for Misconceived Reasons: An Empirical...
mt-weekly
en
⏰
3.0
min
Monday, May 31, 2021
This week I am going to briefly comment on a paper that uses unsupervised machine translation to improve unsupervised scoring for parallel data mining. The title of the paper is Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining, it has authors from Charles University and...
mt-weekly
en
⏰
1.3
min
Sunday, May 23, 2021
At this year’s NAACL, there will be a paper that tries to view NLP from the perspective of deontological ethics and promotes an unusual and very insightful view on NLP ethics. The title of the paper is Case Study: Deontological Ethics in NLP, it was...
mt-weekly
en
⏰
3.2
min
Thursday, May 20, 2021
Automation of stuff that does not need to be automated at all is one of my favorite procrastination activities. As an experienced (and most of the time unsuccessful) submitter to conferences organized by ACL (ACL, NAACL, EACL, EMNLP), I spent a lot of procrastinating...
en
⏰
3.7
min
Sunday, May 16, 2021
The lack of broader context is one of the main problems in machine translation and in NLP in general. People tried various methods with actually quite mixed results. A recent preprint from Unbabel introduces an unusual quantification of context-awareness and, based on that, does some...
mt-weekly
en
⏰
1.7
min
Sunday, May 09, 2021
This week I will comment on a preprint Cross-lingual hate speech detection based on multilingual domain-specific word embeddings by authors from the University of Chile. The pre-print evaluates the possibility of cross-lingual transfer of models for hate speech detection, i.e., training a model in one...
mt-weekly
en
⏰
1.9
min
Sunday, May 02, 2021
This week, I will comment on a paper by authors from the University of Maryland and Google Research on reference-free evaluation of machine translation, which seems quite disturbing to me and suggests there is a lot about current MT models we still don’t quite...
mt-weekly
en
⏰
2.4
min
Sunday, April 25, 2021
Using pre-trained multilingual representation as a universal encoder for machine translation might seem like an obvious thing to try: train a decoder into one target language using one or several source languages and you get a translation from 100 languages into the target language. This...
mt-weekly
en
⏰
1.7
min
Sunday, April 18, 2021
This week, I will comment on a paper by my good old friends from Charles University in collaboration with the University of Edinburgh, the University of Sheffield, and the University of Tartu within the Bergamot project. The main goal of the project is to develop...
mt-weekly
en
⏰
2.4
min
Sunday, April 11, 2021
This week, I would like to feature three recent papers with innovations in neural architectures that I think might become important in MT and multilingual NLP during the next year. But of course, I might be wrong, in MT Weekly 27, I self-assuredly claimed that...
mt-weekly
en
⏰
2.9
min
Wednesday, March 31, 2021
Today, I will comment on a paper on non-autoregressive machine translation that shows a neat trick for increasing output fluency. The title of the paper is Non-Autoregressive Translation by Learning Target Categorical Codes; it has authors from several Chinese private and public institutions and will appear...
mt-weekly
en
⏰
2.5
min
Saturday, March 20, 2021
This week, I will have a look at a pre-print that describes an unconventional setup for zero-shot machine translation. The title of the pre-print is Self-Learning for Zero-Shot Neural Machine Translation and was written by authors from the University of Trento. First of all, I...
mt-weekly
en
⏰
2.2
min
Sunday, March 14, 2021
Transformers are the neural architecture that underlies most of the current state-of-the-art machine translation and natural language processing in general. One of its major drawbacks is the quadratic complexity of the underlying self-attention mechanism, which in practice limits the sequence length that could be processed...
mt-weekly
en
⏰
4.1
min
Sunday, March 07, 2021
This week, I will have a closer look at a recent pre-print introducing an alternative for parallel data filtering for machine translation training. The title of the pre-print is Gradient-guided Loss Masking for Neural Machine Translation and comes from CMU and Google. Training data cleanness...
mt-weekly
en
⏰
2.7
min
Sunday, February 21, 2021
This week I will discuss a paper about the one-shot vocabulary learning abilities of machine translation. The title of the paper is Continuous Learning in Neural Machine Translation using Bilingual Dictionaries and will be presented at EACL in May this year. A very similar idea...
mt-weekly
en
⏰
2.5
min
Sunday, February 14, 2021
Today, I am going to comment on a paper that systematically explores something that probably many MT users do: pre-editing (editing the source sentence) to get a better output from an MT system that is treated as a black box. The title of the...
mt-weekly
en
⏰
1.8
min
Sunday, February 07, 2021
If someone told me ten years ago, when I was a freshly graduated bachelor of computer science, that there would be models producing multilingual sentence representations allowing zero-shot model transfer, I would have hardly believed such a prediction. If they added that the models...
mt-weekly
en
⏰
3.0
min
Sunday, January 24, 2021
This week I am going to revisit the mystery of decoding in neural machine translation for one more time. It has been more than a year ago when Felix Stahlberg and Bill Byrne discovered the very disturbing feature of neural machine translation models – that...
mt-weekly
en
⏰
2.0
min
Sunday, January 17, 2021
Today, I am going to talk about a recent pre-print on sequence-to-sequence models for deciphering substitution ciphers. Doing such a thing was somewhere at the bottom of my todo list for a few years; I suggested it as a thesis topic to several master students...
mt-weekly
en
⏰
2.0
min
Friday, January 08, 2021
Half a year ago I featured here (MT Weekly 45) a paper that questions the contribution of non-autoregressive models to computational efficiency. It showed that a model with a deep encoder (that can be parallelized) and a shallow decoder (that works sequentially) reaches the same...
mt-weekly
en
⏰
3.7
min
Sunday, December 20, 2020
This week I will have a look at the best paper from this year’s COLING that brings an interesting view on inference in NMT models. The title of the paper is “Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine...
mt-weekly
en
⏰
2.5
min
Saturday, December 12, 2020
Papers about new models for sequence-to-sequence modeling have always been my favorite genre. This week I will talk about a model called EDITOR that was introduced in a pre-print of a paper that will appear in the TACL journal with authors from the University of...
mt-weekly
en
⏰
2.8
min
Saturday, December 05, 2020
This week I will comment on a short paper from Carnegie Mellon University and Amazon that shows a simple analysis of the diversity of machine translation outputs. The title of the paper is Decoding and Diversity in Machine Translation and it will be presented at...
mt-weekly
en
⏰
1.9
min
Sunday, November 29, 2020
This week, I will follow up on last week’s post and comment on the news from this year’s WMT that was co-located with EMNLP. As every year, there were many shared tasks on various types of translation and evaluation of machine translation. News translation task...
mt-weekly
en
⏰
2.3
min
Saturday, November 21, 2020
Another large NLP conference that had to take place in a virtual environment, EMNLP 2020, is over, and here are my notes from the conference. ACL in the summer had most Q&A sessions on Zoom, which meant most of the authors waiting forever...
mt-weekly
en
⏰
7.1
min
Sunday, November 08, 2020
Today, I am going to talk about a topic that is rather unknown to me: the safety and vulnerability of machine translation. I will comment on a paper Targeted Poisoning Attacks on Black-Box Neural Machine Translation by authors from the University of Melbourne and Facebook...
mt-weekly
en
⏰
2.5
min
Sunday, November 01, 2020
This week, I am going to discuss the paper “Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation” by authors from Alibaba Group. The preprint of the paper appeared a month ago on arXiv and will be presented at this...
mt-weekly
en
⏰
2.4
min
Sunday, October 25, 2020
Last year an EMNLP paper “On NMT Search Errors and Model Errors: Cat Got Your Tongue?” (that I discussed in MT Weekly 20) showed a mindblowing property of neural machine translation models that the most probable target sentence is not necessarily the best target sentence....
mt-weekly
en
⏰
3.0
min
Saturday, October 17, 2020
This week, I am going to have a closer look at a paper that creatively uses methods for bilingual word embeddings for social media analysis. The paper’s preprint was uploaded last week on arXiv. The title is “We Don’t Speak the Same Language: Interpreting Polarization...
mt-weekly
en
⏰
2.1
min
Saturday, October 10, 2020
The article originally appeared in last year’s December issue of the journal Rozhledy matematicko-fyzikální. What is machine translation? When thinking of machine translation, most people probably picture Google Translate, and most people have probably also tried how it works. Those who use the translator more often may have noticed that roughly...
cs
popularization
⏰
10.9
min
This week, I will discuss Nearest Neighbor Machine Translation, a paper from this year’s ICML that takes advantage of overlooked representation learning capabilities of machine translation models. This paper’s idea is pretty simple and is basically the same as in the previous work on nearest...
mt-weekly
en
⏰
1.8
min
Friday, October 02, 2020
After a short break, MT weekly is again here, and today I will talk about a paper “CSP: Code-Switching Pre-training for Neural Machine Translation” that will appear at this year’s virtual EMNLP. The paper proposes a new and surprisingly elegant way of monolingual pre-training for...
mt-weekly
en
⏰
2.1
min
Friday, September 11, 2020
This week I am going to have a look at a paper by my former colleagues from Prague “Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals” that was published in Nature Communications. The paper systematically studies machine translation...
mt-weekly
en
⏰
2.6
min
Thursday, September 03, 2020
Over the few years that neural models have been the state of the art in machine translation, the architectures have become quite standardized. There is a vocabulary of several thousand discrete input/output units. As the first step, the inputs are represented by static embeddings which get encoded...
mt-weekly
en
⏰
2.7
min
Sunday, August 30, 2020
Pre-trained multilingual representations promise to make the current best NLP model available even for low-resource languages. With a truly language-neutral pre-trained multilingual representation, we could train a task-specific model for English (or another language with available training data) and such a model would work for...
mt-weekly
en
⏰
3.4
min
Friday, August 21, 2020
It is a well-known fact that when you have a hammer, everything looks like a nail. It is a less-known fact that when you have a sequence-to-sequence model, everything looks like machine translation. One example of this thinking is the paper Paraphrase Generation as Zero-Shot...
mt-weekly
en
⏰
2.2
min
Saturday, August 15, 2020
This week, I will comment on a recent pre-print by Facebook AI titled Pre-training via Paraphrasing. The paper introduces a model called MARGE (indeed, they want to say it belongs to the same family as BART by Facebook) that uses a clever way of denoising...
mt-weekly
en
⏰
3.2
min
Friday, July 10, 2020
In this extremely long post, I will not focus on one paper as I usually do, but instead will show my brief, but still infinitely long notes from this year’s ACL. Many people already commented on the virtual format of the conference. I will spare...
mt-weekly
en
⏰
9.5
min
Friday, July 03, 2020
Back in 2013, a friend of mine enthusiastically told me how excited he was about deep learning democratizing AI (while, by the way, saying it was not relevant for NLP at all): there was no need for large CPU clusters, all you needed was buying a gaming...
mt-weekly
en
⏰
3.7
min
Friday, June 26, 2020
Researchers concerned with machine translation speed invented several methods that are supposed to significantly speed up the translation while maintaining as much as possible from the translation quality of the state-of-the-art models. The methods are usually based on generating as many words as possible in...
mt-weekly
en
⏰
2.7
min
Friday, June 19, 2020
For quite a while, machine translation has been approached as a behaviorist simulation. Don’t you know what a good translation is? It does not matter, you can just simulate what humans do. Don’t you know how to measure if something is a good translation? It does...
mt-weekly
en
⏰
2.9
min
Friday, June 12, 2020
One of the narratives people (including me) love to associate with neural machine translation is that we got rid of all linguistic assumptions about the text and let the neural networks learn their own way, independent of what people think about language. It sounds cool,...
mt-weekly
en
⏰
2.9
min
Saturday, June 06, 2020
Several weeks ago, I discussed a paper that showed how parallel data between two languages can be used to improve unsupervised translation between one of the two languages and a third one. This week, I will have a look at a similar idea applied in...
mt-weekly
en
⏰
2.7
min
Saturday, May 09, 2020
Recently, I came across a paper that announces a dataset release. The dataset is called PuzzLing and collects translation puzzles from the international linguistic olympiad. In machine translation jargon, I would say it provides extremely small training data to learn how to translate unknown languages....
mt-weekly
en
⏰
2.9
min
Saturday, May 02, 2020
More than half a year ago (in MT Weekly 10), I discussed massively multilingual models by Google. They managed to train two large models: one translating from 102 languages into English, the other one from English into 102 languages. This approach seemed to help a...
mt-weekly
en
⏰
1.9
min
Tuesday, April 28, 2020
English version of the post. On this blog, I usually comment on research papers on machine translation that are rarely longer than ten pages. This time, I will make an exception and write about a book that is several hundred pages long. It deals with some important societal problems that...
cs
⏰
8.5
min
Czech version of the post. On this blog, I usually review papers that are around 10 pages long. This time, I am going to write about a book that is several hundred pages long and discusses important issues that I believe people dealing with AI should...
en
⏰
9.6
min
Saturday, April 25, 2020
Before the Transformer architecture was invented, recurrent networks were the most prominent architectures used in machine translation and the rest of natural language processing. It is quite surprising how little we still know about the architectures from the theoretical perspective. People often repeat a claim...
mt-weekly
en
⏰
2.9
min
Saturday, April 18, 2020
In the recent week, there were quite a lot of papers on machine translation on arXiv, at least a few of them every day. Let me have a look at one that tackles an important topic – machine translation evaluation – from a quite unusual...
mt-weekly
en
⏰
3.0
min
Friday, April 10, 2020
It is sometimes fascinating to observe how each step of training neural machine translation systems gets one by one picked up by the research community, analyzed to the tiniest detail and turned into a complex recipe. Data augmentation by back-translation used to be a pretty...
mt-weekly
en
⏰
2.1
min
Saturday, April 04, 2020
This week, I am going to have a look at a topic that I was not thinking about much before: sign language translation. I will comment on a bachelor thesis from Ecole Polytechnique in Paris that was uploaded to arXiv earlier this week. It was...
mt-weekly
en
⏰
2.6
min
Friday, March 27, 2020
I always rationalized the encoder-decoder architecture as a conditional language model. The decoder is the language model, the component that “knows” the target language (whatever it means) and uses the encoder as an external memory, so it does not forget what it was talking about....
mt-weekly
en
⏰
2.2
min
Saturday, March 21, 2020
This week I am going to write a few notes on paper Echo State Neural Machine Translation by Google Research from some weeks ago. Echo state networks are a rather weird idea: initialize the parameters of a recurrent neural network randomly, keep them fixed and...
mt-weekly
en
⏰
1.6
min
Friday, March 13, 2020
This week, I will have a look at a recent pre-print that presents an interesting method for document-level machine translation that is quite different from all previous ones. The title of the paper is Capturing document context inside sentence-level neural machine translation models with self-training and...
mt-weekly
en
⏰
2.0
min
Thursday, March 05, 2020
I am pretty sure everyone has tried to use BERT as a machine translation encoder, and whoever says otherwise keeps trying. Representations from BERT brought improvement in most natural language processing tasks, so why would machine translation be an exception? Well, because it is not that easy....
mt-weekly
en
⏰
2.5
min
Friday, February 28, 2020
This week, I am going to comment on a paper that appeared on arXiv on Tuesday and raised quite a lot of interest on Twitter. The title of the paper is Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation and it describes work that has...
mt-weekly
en
⏰
4.1
min
Friday, February 21, 2020
This week, it is the third time in recent weeks that I am reviewing a paper that primarily focuses on unsupervised machine translation. The title of the paper is A Multilingual View on Unsupervised Machine Translation and it describes again work done...
mt-weekly
en
⏰
5.7
min
Wednesday, February 12, 2020
This week I will have a look at a paper from last year’s EMNLP that introduces a relatively simple architecture for sequence generation when the target sequence is very similar to the source sequence. The title of the paper is “Encode, Tag, Realize: High-Precision Text...
mt-weekly
en
⏰
4.0
min
Friday, February 07, 2020
The trend of model pre-training and task-specific fine-tuning finally fully hit machine translation as well. After being used for some time for unsupervised machine translation training, at the end of January Facebook published a pre-trained sequence-to-sequence model for 25 languages at the same time....
mt-weekly
en
⏰
4.0
min
Friday, January 31, 2020
This week I will review a paper that is not primarily about machine translation but about a neural architecture that can make a big impact on machine translation and natural language processing in general. This post is about Google’s Reformer, a neural architecture that is...
mt-weekly
en
⏰
8.9
min
Thursday, January 23, 2020
One of the hottest topics in machine translation and one of the topics I ignored so far is unsupervised machine translation, i.e., machine translation trained without the use of any parallel data. I will go through a seven-month-old paper published at this year’s ACL titled...
mt-weekly
en
⏰
6.5
min
Thursday, January 16, 2020
Back in 2016, one of the trendy topics was reinforcement learning and other forms of optimizing NMT directly towards some more relevant metrics rather than using cross-entropy of the conditional word distributions. Standard machine translation models are trained to maximize single-word conditional distribution, which is...
mt-weekly
en
⏰
2.8
min
Thursday, January 09, 2020
After the Christmas holidays, I will once again have a look at multilingual BERT. I already discussed multilingual BERT on this blog once when I reviewed a paper that explored some cross-lingual and multilingual properties of multilingual BERT. This week’s paper does more in-depth experiments...
mt-weekly
en
⏰
3.3
min
Thursday, December 12, 2019
This week, I would like to give some thoughts about word senses and representation contextualization in machine translation. I will start by explaining why I think the current way of writing about word senses in NLP is kind of misleading and why I think we...
mt-weekly
en
⏰
4.3
min
Friday, December 06, 2019
Last week, I discussed a paper claiming that forward-translation might be a better data augmentation technique than back-translation. This week, I will follow with a paper that touches a similar topic, but in a slightly different context. The title of the paper is Understanding Knowledge Distillation in Non-Autoregressive Machine Translation and was...
mt-weekly
en
⏰
3.8
min
Thursday, November 28, 2019
Does WMT speak translationese? And who else speaks translationese? Is the success of back-translation fake news? These are the questions implicitly asked by the authors of a paper called Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation, which was uploaded to arXiv earlier...
mt-weekly
en
⏰
3.4
min
Thursday, November 21, 2019
This week, I will have a look at a paper from this year’s EMNLP that got a lot of attention on Twitter this week. If there was an award for the most disturbing machine translation paper, this would be a good candidate. The title of...
mt-weekly
en
⏰
4.1
min
Thursday, November 14, 2019
This week, I will briefly have a look at a paper that discusses another major problem of current machine translation which is domain robustness. The problem is very well analyzed in a paper from the University of Zurich called Domain Robustness in Neural Machine Translation...
mt-weekly
en
⏰
2.4
min
Thursday, November 07, 2019
Everyone who followed natural language processing on Twitter last week must have noticed a paper called BPE-Dropout: Simple and Effective Subword Regularization that introduces a simple way of adding stochastic noise into text segmentation to increase model robustness. It sounds complicated, but it is fairly easy. As...
mt-weekly
en
⏰
2.7
min
Thursday, October 31, 2019
One of the biggest limitations of current machine translation systems is they only work with isolated sentences. The systems need to guess when it comes to phenomena that cross the (often rather arbitrary) sentence boundaries. The typical example that is mentioned everywhere is the translation...
mt-weekly
en
⏰
4.0
min
Wednesday, October 23, 2019
One of the topics I am currently dealing with in my research is character-level modeling for neural machine translation. Therefore, I was glad to see a paper that appeared on arXiv last week called On the Importance of Word Boundaries in Character-level Neural Machine Translation that shows an interesting...
mt-weekly
en
⏰
5.3
min
Friday, October 18, 2019
This week, I will slightly depart from machine translation and have a look at a paper How Multilingual is Multilingual BERT by Google Research. BERT, the Sesame Street muppet that recently colonized the whole area of natural language processing is a model trained to predict...
mt-weekly
en
⏰
4.0
min
Thursday, October 10, 2019
Neural machine translation is based on machine learning—we collect training data, pairs of parallel sentences which we hope represent how language is used in the two languages, and train models using the data. When the model is trained, the more the input resembles the sentences...
mt-weekly
en
⏰
2.8
min
Wednesday, October 02, 2019
Let us follow up on the gender paper and have a look at other cases where machine translation does not work as well as we would like it to work. This time, we will have a look at a paper that talks about grammatically complex...
mt-weekly
en
⏰
3.4
min
Thursday, September 26, 2019
Five years ago when deep learning slowly started to be cool, there was a paper called Neural Turing Machines (which are not really Turing machines, but at least they are neural in a narrow technical sense). The paper left me with a foolishly naive impression...
mt-weekly
en
⏰
3.0
min
Wednesday, September 18, 2019
It’s time to talk about gender—why things go wrong with gender in machine translation and what people do about it. Some languages have gendered nouns (German), some have gendered almost everything (Czech, French), and some only a few pronouns (English). Let’s say you want to translate...
mt-weekly
en
⏰
2.9
min
Wednesday, September 11, 2019
The holiday period is over and I almost settled in my new place of operation which is the Ludwig-Maximilian University of Munich, and now there is nothing that can prevent me from continuing with weekly reports on what is new in the world of machine...
mt-weekly
en
⏰
2.7
min
Wednesday, July 10, 2019
Machine translation is typically trained on bilingual data that can be found on the Internet. It mostly comes from international government and non-government organizations, commercial web presentations, books, movie subtitles, etc. Therefore, most of the text is quite formal and almost without typos and certainly...
mt-weekly
en
⏰
3.0
min
Thursday, July 04, 2019
This week’s post contains more math than usual. I will talk about a paper that unifies several decoding algorithms in MT using one simple equation. The paper is called A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models, it comes from New...
mt-weekly
en
⏰
4.8
min
Monday, June 24, 2019
Remember two years ago when all tech news sites enthusiastically reported that a translator by Google had created its own language? These reports were based on a paper published in TACL in summer 2017, whose pre-print had been available on arXiv since November 2016. The paper...
mt-weekly
en
⏰
3.6
min
Tuesday, June 11, 2019
This week, we will have a look at a paper that won the best short paper award at NAACL 2019. The name of the paper is Probing the Need for Visual Context in Multimodal Machine Translation and it was written by friends of mine from...
mt-weekly
en
⏰
2.5
min
Tuesday, June 04, 2019
Let’s continue with pre-prints of papers which are going to appear at ACL this year and have a look at another paper that comes from the University of Edinburgh, titled Revisiting Low-Resource Neural Machine Translation: A Case Study. This paper is a reaction to an...
mt-weekly
en
⏰
2.2
min
Monday, May 27, 2019
With the ACL camera-ready deadline slowly approaching, future ACL papers start to pop up on arXiv. One of those which went public just a few days ago is a paper called Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned...
mt-weekly
en
⏰
4.2
min
Friday, May 17, 2019
This week, we will have a look at a brand-new method for non-autoregressive machine translation published a few weeks ago on arXiv by Facebook AI Research, two days before the anonymity period for the EMNLP conference. Most models for neural machine translation work autoregressively. When...
mt-weekly
en
⏰
3.0
min
Wednesday, May 01, 2019
Last week, there was a paper on arXiv that introduces a method for MT evaluation using BERT sentence representation. The metric seems to be the new state of the art in MT evaluation. It is called BERTScore and was developed at Cornell University. MT evaluation...
mt-weekly
en
⏰
2.9
min
Tuesday, April 23, 2019
This is the first post from a series in which I will try to come up with summaries of some of the latest papers and other news on machine translation. The main goal of this exercise is to force myself to read new papers regularly...
mt-weekly
en
⏰
3.1
min
Tuesday, November 21, 2017
English version of the post. When I read in the media about technologies that use machine learning, or artificial intelligence, as it is now called, I am often surprised how the news stories are not only oversimplified but, above all, misleading. This surely happens in all fields; nevertheless, getting upset...
popularization
cs
⏰
5.4
min
Czech version of the post. While reading news stories on research or products involving deep learning, I am often surprised by how inaccurate and misleading the news stories are. It is probably a problem of almost all expert fields that happen to appear in the media; luckily, they do...
popularization
en
⏰
5.9
min
Monday, May 29, 2017
English version of the post. Since I was a child, I thought biathlon was a weird sport. It puzzled me how anyone came up with the idea of competing in things as different as skiing and shooting. An even bigger surprise came when I learned about the existence of the modern pentathlon. Thanks to...
popularization
cs
⏰
9.1
min
Czech version of the post. Since I was a little boy, I was astonished at how weird a sport biathlon is. I couldn’t imagine how someone could possibly invent a combination of cross-country skiing and shooting. It blew my mind when I found out there is an even weirder combination...
popularization
en
⏰
6.8
min
Monday, March 20, 2017
Czech version of the post. In this essay, I would like to sum up my opinions on what the role of computational linguistics is, why people should care about it, what I believe its current problems are, and most importantly why it is a very exciting field...
popularization
en
⏰
8.2
min
English version of the post. In this post, I will try to give a biased and engaged summary of what computational (or, if you prefer, mathematical) linguistics is, what its current problems are, and why it is nevertheless a fascinating field worth pursuing. Computational linguistics is something between...
popularization
cs
⏰
7.3
min
Wednesday, February 22, 2017
English version of the post. In recent years, deep learning (machine learning with neural networks) has become a buzzword of the technological world. We can read articles about what artificial intelligence can do (the worn-out term artificial intelligence is now readily replaced with machine intelligence) – how it can solve machine translation,...
popularization
cs
⏰
10.1
min
Czech version of the post. In recent years, deep learning (machine learning with neural networks) has become a frequently used buzzword in the technological world. We can find plenty of articles on how machine intelligence (a new, probably sexier term for artificial intelligence) can solve machine translation,...
popularization
en
⏰
9.9
min
Tuesday, November 29, 2016
Czech version of the post. A year ago at EMNLP in Lisbon, I saw a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and almost apologized to everybody who passed by that the...
popularization
en
⏰
4.7
min
English version of the post. A year ago, at the EMNLP conference in Lisbon, I spotted a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and, to everyone who stopped by his poster...
popularization
cs
⏰
4.2
min