The ultimate goal of AI is a world taking care of itself, so that humans can finally become a pack of carefree, barefooted, long-haired hippies. (← irony)

Posts

Machine Translation Weekly 52: Human Parity in Machine Translation

This week I am going to have a look at a paper by my former colleagues from Prague, “Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals”, which was published in Nature Communications. The paper systematically studies machine translation...

mt-weekly  en  ⏰ 2.6 min

Machine Translation Weekly 51: Machine Translation without Embeddings

Over the few years during which neural models have been the state of the art in machine translation, the architectures have become quite standardized. There is a vocabulary of several thousand discrete input/output units. As the first step, the inputs are represented by static embeddings which get encoded...

mt-weekly  en  ⏰ 2.7 min
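To make the standard setup from the excerpt above concrete, here is a minimal NumPy sketch (with made-up sizes, not taken from the paper) of what the static embedding step amounts to: discrete token IDs simply index a trainable matrix before anything is encoded.

```python
import numpy as np

# Toy sizes standing in for a typical subword vocabulary and model dimension.
vocab_size, emb_dim = 32000, 512
rng = np.random.default_rng(0)

# The embedding table: one trainable vector per discrete input/output unit.
embedding_table = rng.normal(size=(vocab_size, emb_dim)).astype(np.float32)

token_ids = np.array([17, 4021, 9, 253])  # a toy tokenized sentence
embedded = embedding_table[token_ids]     # the "static embedding" lookup
print(embedded.shape)                     # (4, 512), ready for the encoder
```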

Machine Translation Weekly 50: Language-Agnostic Multilingual Representations

Pre-trained multilingual representations promise to make the current best NLP models available even for low-resource languages. With a truly language-neutral pre-trained multilingual representation, we could train a task-specific model for English (or another language with available training data) and such a model would work for...

mt-weekly  en  ⏰ 3.4 min

Machine Translation Weekly 49: Paraphrasing using multilingual MT

It is a well-known fact that when you have a hammer, everything looks like a nail. It is a less-known fact that when you have a sequence-to-sequence model, everything looks like machine translation. One example of this thinking is the paper Paraphrase Generation as Zero-Shot...

mt-weekly  en  ⏰ 2.2 min

Machine Translation Weekly 48: MARGE

This week, I will comment on a recent pre-print by Facebook AI titled Pre-training via Paraphrasing. The paper introduces a model called MARGE (indeed, they want to say it belongs to the same family as BART by Facebook) that uses a clever way of denoising...

mt-weekly  en  ⏰ 3.2 min

Machine Translation Weekly 47: Notes from the ACL

In this extremely long post, I will not focus on one paper as I usually do, but instead will show my brief, but still infinitely long notes from this year’s ACL. Many people already commented on the virtual format of the conference. I will spare...

mt-weekly  en  ⏰ 9.5 min

Machine Translation Weekly 46: The News GPT-3 has for Machine Translation

Back in 2013, a friend of mine enthusiastically told me how excited he was about deep learning democratizing AI (while, by the way, saying it was not relevant for NLP at all): there was no need for large CPU clusters, all you needed was to buy a gaming...

mt-weekly  en  ⏰ 3.7 min

Machine Translation Weekly 45: Deep Encoder, Shallow Decoder, and the Fall of Non-autoregressive models

Researchers concerned with machine translation speed have invented several methods that are supposed to significantly speed up translation while maintaining as much as possible of the translation quality of the state-of-the-art models. The methods are usually based on generating as many words as possible in...

mt-weekly  en  ⏰ 2.7 min

Machine Translation Weekly 44: Tangled up in BLEU (and not blue)

For quite a while, machine translation has been approached as a behaviorist simulation. Don’t you know what a good translation is? It does not matter, you can just simulate what humans do. Don’t you know how to measure if something is a good translation? It does...

mt-weekly  en  ⏰ 2.9 min

Machine Translation Weekly 43: Dynamic Programming Encoding

One of the narratives people (including me) love to associate with neural machine translation is that we got rid of all linguistic assumptions about the text and let the neural network learn its own way, independent of what people think about language. It sounds cool,...

mt-weekly  en  ⏰ 2.9 min

Machine Translation Weekly 42: Unsupervised Multimodal Machine Translation

Several weeks ago, I discussed a paper that showed how parallel data between two languages can be used to improve unsupervised translation between one of the two languages and a third one. This week, I will have a look at a similar idea applied in...

mt-weekly  en  ⏰ 2.7 min

Machine Translation Weekly 41: Translating Fast and Slow

Recently, I came across a paper that announces a dataset release. The dataset is called PuzzLing and collects translation puzzles from the International Linguistics Olympiad. In machine translation jargon, I would say it provides extremely small training data to learn how to translate unknown languages....

mt-weekly  en  ⏰ 2.9 min

Machine Translation Weekly 40: Getting Massively Multilingual Again

More than half a year ago (in MT Weekly 10), I discussed massively multilingual models by Google. They managed to train two large models: one translating from 102 languages into English, the other one from English into 102 languages. This approach seemed to help a...

mt-weekly  en  ⏰ 1.9 min

O datovém kolonializmu

English version of the post. On this blog, I mostly comment on research papers about machine translation, which tend to be no longer than ten pages. This time, I will make an exception and write about a book that is several hundred pages long. It deals with some important societal problems that...

cs  ⏰ 8.5 min

On Data Colonialism

Czech version of the post. On this blog, I usually review papers that are around 10 pages long. This time, I am going to write about a book that is several hundred pages long and discusses important issues that I believe people dealing with AI should...

en  ⏰ 9.6 min

Machine Translation Weekly 39: Formal Hierarchy of Recurrent Architectures

Before the Transformer architecture was invented, recurrent networks were the most prominent architectures used in machine translation and the rest of natural language processing. It is quite surprising how little we still know about the architectures from the theoretical perspective. People often repeat a claim...

mt-weekly  en  ⏰ 2.9 min

Machine Translation Weekly 38: Taking Care about Reference Sentences

In the past week, there were quite a lot of papers on machine translation on arXiv, at least a few of them every day. Let me have a look at one that tackles an important topic – machine translation evaluation – from a quite unusual...

mt-weekly  en  ⏰ 3.0 min

Machine Translation Weekly 37: Backtranslation and Domain Adaptation

It is sometimes fascinating to observe how each step of training neural machine translation systems gets picked up by the research community one by one, analyzed to the tiniest detail, and turned into a complex recipe. Data augmentation by back-translation used to be a pretty...

mt-weekly  en  ⏰ 2.1 min
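For readers new to the technique, here is a minimal sketch of plain back-translation as data augmentation; `train` and `translate` are hypothetical helpers standing in for a real NMT toolkit, and the refinements discussed in the paper come on top of this recipe.

```python
# A minimal sketch of back-translation; `train` and `translate` are
# hypothetical helpers, not a real library API.

def back_translate(parallel_pairs, target_monolingual):
    """parallel_pairs: list of (src, tgt) sentence pairs;
    target_monolingual: list of target-language sentences."""
    # 1. Train a reverse (target-to-source) model on the flipped parallel data.
    reverse_model = train([(tgt, src) for src, tgt in parallel_pairs])
    # 2. Translate monolingual target-side text back into the source language.
    synthetic_sources = [translate(reverse_model, t) for t in target_monolingual]
    # 3. Mix the synthetic pairs with the authentic ones, train the forward model.
    augmented = parallel_pairs + list(zip(synthetic_sources, target_monolingual))
    return train(augmented)
```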

Machine Translation Weekly 36: Sign Language Translation

This week, I am going to have a look at a topic that I was not thinking about much before: sign language translation. I will comment on a bachelor thesis from Ecole Polytechnique in Paris that was uploaded to arXiv earlier this week. It was...

mt-weekly  en  ⏰ 2.6 min

Machine Translation Weekly 35: Word Translation of Transformer Layers

I have always rationalized the encoder-decoder architecture as a conditional language model. The decoder is the language model, the component that “knows” the target language (whatever that means), and uses the encoder as an external memory, so it does not forget what it was talking about....

mt-weekly  en  ⏰ 2.2 min

Machine Translation Weekly 34: Echo State Neural Machine Translation

This week I am going to write a few notes on the paper Echo State Neural Machine Translation by Google Research from a few weeks ago. Echo state networks are a rather weird idea: initialize the parameters of a recurrent neural network randomly, keep them fixed and...

mt-weekly  en  ⏰ 1.6 min
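For readers who have not met echo state networks before, here is a minimal NumPy sketch of the general idea (my own toy illustration, not the paper's translation model): the recurrent weights are random and frozen, and only a linear readout is ever fitted.

```python
import numpy as np

rng = np.random.default_rng(1)
input_dim, hidden_dim = 8, 64

# Random, *fixed* reservoir weights: these are never trained.
W_in = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_rec = rng.normal(size=(hidden_dim, hidden_dim))
W_rec *= 0.9 / max(abs(np.linalg.eigvals(W_rec)))  # keep spectral radius < 1

def reservoir_states(inputs):
    """Run the frozen recurrent network and collect its hidden states."""
    h = np.zeros(hidden_dim)
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return np.stack(states)

# Only the linear readout is fitted, e.g., by least squares on toy targets.
X = reservoir_states(rng.normal(size=(100, input_dim)))
Y = rng.normal(size=(100, 3))
W_out, *_ = np.linalg.lstsq(X, Y, rcond=None)  # the single trained component
```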

Machine Translation Weekly 33: Document-level translation via self-fine-tuning

This week, I will have a look at a recent pre-print that presents an interesting method for document-level machine translation that is quite different from all previous ones. The title of the paper Capturing document context inside sentence-level neural machine translation models with self-training and...

mt-weekly  en  ⏰ 2.0 min

Machine Translation Weekly 32: BERT in Machine Translation

I am pretty sure everyone has tried to use BERT as a machine translation encoder, and whoever says otherwise just keeps trying. Representations from BERT brought improvements in most natural language processing tasks, so why would machine translation be an exception? Well, because it is not that easy....

mt-weekly  en  ⏰ 2.5 min

Machine Translation Weekly 31: Fixing Transformer's Heads

This week, I am going to comment on a paper that appeared on arXiv on Tuesday and raised quite a lot of interest on Twitter. The title of the paper is Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation and it describes work that has...

mt-weekly  en  ⏰ 4.1 min

Machine Translation Weekly 30: A Multilingual View of Unsupervised Machine Translation

This week, for the third time in recent weeks, I am going to review a paper that primarily focuses on unsupervised machine translation. The title of the paper is A Multilingual View of Unsupervised Machine Translation and it again describes work done...

mt-weekly  en  ⏰ 5.7 min

Machine Translation Weekly 29: Encode, Tag, Realize - sequence transformation by learned edit operations

This week I will have a look at a paper from last year’s EMNLP that introduces a relatively simple architecture for sequence generation when the target sequence is very similar to the source sequence. The title of the paper is “Encode, Tag, Realize: High-Precision Text...

mt-weekly  en  ⏰ 3.8 min

Machine Translation Weekly 28: mBART – Multilingual Pretraining of Sequence-to-sequence Models

The trend of model pre-training and task-specific fine-tuning has finally fully hit machine translation as well. After being used for some time in unsupervised machine translation training, at the end of January, Facebook published mBART, a pre-trained sequence-to-sequence model for 25 languages at the same time....

mt-weekly  en  ⏰ 4.0 min

Machine Translation Weekly 27: Explaining the Reformer

This week I will review a paper that is not primarily about machine translation, but about a neural architecture that can nevertheless make a big impact on machine translation and natural language processing in general. This post is about Google’s Reformer, a neural architecture that is...

mt-weekly  en  ⏰ 8.9 min

Machine Translation Weekly 26: Unsupervised Machine Translation

One of the hottest topics in machine translation, and one of the topics I have ignored so far, is unsupervised machine translation, i.e., machine translation trained without the use of any parallel data. I will go through a seven-month-old paper published at this year’s ACL titled...

mt-weekly  en  ⏰ 6.5 min

Machine Translation Weekly 25: Weaknesses of Reinforcement Learning for NMT

Back in 2016, one of the trendy topics was reinforcement learning and other forms of optimizing NMT directly towards some more relevant metrics, rather than using cross-entropy of the conditional word distributions. Standard machine translation models are trained to maximize the single-word conditional distributions, which is...

mt-weekly  en  ⏰ 2.7 min
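Spelled out in standard notation (my restatement, not a formula from the paper), the usual cross-entropy training maximizes the per-word conditional log-likelihood,

$$\mathcal{L}_{\mathrm{CE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta(y_t \mid y_{<t}, x),$$

whereas reinforcement-learning-style training instead maximizes an expected sentence-level reward $\mathbb{E}_{\hat{y} \sim p_\theta(\cdot \mid x)}[R(\hat{y}, y)]$, with, e.g., BLEU as the reward $R$.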

Machine Translation Weekly 24: Cross-Lingual Ability of Multilingual BERT

After the Christmas holidays, I will once again have a look at multilingual BERT. I already discussed multilingual BERT on this blog once when I reviewed a paper that explored some cross-lingual and multilingual properties of multilingual BERT. This week’s paper does more in-depth experiments...

mt-weekly  en  ⏰ 3.3 min

Machine Translation Weekly 23: Word Sense Disambiguation in Neural Machine Translation

This week, I would like to give some thoughts about word senses and representation contextualization in machine translation. I will start by explaining why I think the current way of writing about word senses in NLP is kind of misleading and why I think we...

mt-weekly  en  ⏰ 4.1 min

Machine Translation Weekly 22: Understanding Knowledge Distillation in Non-Autoregressive Machine Translation

Last week, I discussed a paper claiming that forward-translation might be a better data augmentation technique than back-translation. This week, I will follow with a paper that touches on a similar topic, but in a slightly different context. The title of the paper is Understanding Knowledge Distillation in Non-Autoregressive Machine Translation and it was...

mt-weekly  en  ⏰ 3.7 min

Machine Translation Weekly 21: On Translationese and Back-Translation

Does WMT speak translationese? And who else speaks translationese? Is the success of back-translation fake news? These are the questions implicitly asked by the authors of a paper called Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation that was uploaded to arXiv earlier...

mt-weekly  en  ⏰ 3.3 min

Machine Translation Weekly 20: Search and Model Errors in Neural Machine Translation

This week, I will have a look at a paper from this year’s EMNLP that got a lot of attention on Twitter this week. If there was an award for the most disturbing machine translation paper, this would be a good candidate. The title of...

mt-weekly  en  ⏰ 3.9 min

Machine Translation Weekly 19: Domain Robustness

This week, I will briefly have a look at a paper that discusses another major problem of current machine translation: domain robustness. The problem is very well analyzed in a paper from the University of Zurich called Domain Robustness in Neural Machine Translation...

mt-weekly  en  ⏰ 2.4 min

Machine Translation Weekly 18: BPE Dropout

Everyone who followed natural language processing on Twitter last week must have noticed a paper called BPE-Dropout: Simple and Effective Subword Regularization that introduces a simple way of adding stochastic noise into text segmentation to increase model robustness. It sounds complicated, but it is fairly easy. As...

mt-weekly  en  ⏰ 2.5 min
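The mechanism really is easy to state; here is a toy re-implementation of the idea (mine, not the authors' code): during segmentation, each applicable BPE merge is skipped with probability p, so the same word gets segmented differently across training epochs.

```python
import random

def bpe_dropout_segment(word, merges, p=0.1):
    """Segment `word` with BPE merges, skipping each candidate merge with
    probability p. `merges` maps a symbol pair to its priority (lower =
    applied earlier); with p=0 this reduces to ordinary deterministic BPE."""
    symbols = list(word)
    while len(symbols) > 1:
        # Collect applicable merges, dropping each one with probability p.
        candidates = [(merges[pair], i)
                      for i, pair in enumerate(zip(symbols, symbols[1:]))
                      if pair in merges and random.random() >= p]
        if not candidates:
            break
        _, i = min(candidates)  # apply the highest-priority surviving merge
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols

toy_merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2, ("low", "er"): 3}
print(bpe_dropout_segment("lower", toy_merges, p=0.3))  # e.g. ['low', 'e', 'r']
```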

Machine Translation Weekly 17: When is Document-Level Context Useful?

One of the biggest limitations of current machine translation systems is that they only work with isolated sentences. The systems need to guess when it comes to phenomena that cross the (often rather arbitrary) sentence boundaries. The typical example that is mentioned everywhere is the translation...

mt-weekly  en  ⏰ 3.9 min

Machine Translation Weekly 16: Hybrid character-level and word-level machine translation

One of the topics I am currently dealing with in my research is character-level modeling for neural machine translation. Therefore, I was glad to see a paper that appeared on arXiv last week called On the Importance of Word Boundaries in Character-level Neural Machine Translation that shows an interesting...

mt-weekly  en  ⏰ 5.1 min

Machine Translation Weekly 15: How Multilingual is Multilingual BERT?

This week, I will slightly depart from machine translation and have a look at the paper How Multilingual is Multilingual BERT by Google Research. BERT, the Sesame Street muppet that recently colonized the whole area of natural language processing, is a model trained to predict...

mt-weekly  en  ⏰ 3.8 min

Machine Translation Weekly 14: Modeling Confidence in Sequence-to-Sequence Models

Neural machine translation is based on machine learning—we collect training data, pairs of parallel sentences which we hope represent how language is used in the two languages, and train models using the data. When the model is trained, the more the input resembles the sentences...

mt-weekly  en  ⏰ 2.7 min

Machine Translation Weekly 13: Long Distance Dependencies

Let us follow up on the gender paper and have a look at other cases where machine translation does not work as well as we would like it to work. This time, we will have a look at a paper that talks about grammatically complex...

mt-weekly  en  ⏰ 3.3 min

Machine Translation Weekly 12: Memory-Augmented Networks

Five years ago when deep learning slowly started to be cool, there was a paper called Neural Turing Machines (which are not really Turing machines, but at least they are neural in a narrow technical sense). The paper left me with a foolishly naive impression...

mt-weekly  en  ⏰ 2.8 min

Machine Translation Weekly 11: Gender and Machine Translation

It’s time to talk about gender—why things go wrong with gender in machine translation and what people do about it. Some languages have gendered nouns (German), some have gendered almost everything (Czech, French), and some gender only a few pronouns (English). Let’s say you want to translate...

mt-weekly  en  ⏰ 2.7 min

Machine Translation Weekly 10: Massively Multilingual Neural Machine Translation

The holiday period is over and I have almost settled in at my new place of operation, the Ludwig-Maximilian University of Munich, and now there is nothing that can prevent me from continuing with weekly reports on what is new in the world of machine...

mt-weekly  en  ⏰ 2.4 min

Machine Translation Weekly 9: Shared Task on Machine Translation Robustness

Machine translation is typically trained on bilingual data that can be found on the Internet. It mostly comes from international governmental and non-governmental organizations, commercial websites, books, movie subtitles, etc. Therefore, most of the text is quite formal, almost without typos, and certainly...

mt-weekly  en  ⏰ 2.7 min

Machine Translation Weekly 8: A Generalized Framework of Sequence Generation

This week’s post contains more math than usual. I will talk about a paper that unifies several decoding algorithms in MT using one simple equation. The paper is called A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models, and it comes from New...

mt-weekly  en  ⏰ 4.6 min

Machine Translation Weekly 7: Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

Remember two years ago when all tech news sites enthusiastically reported that a translator by Google had created its own language? These reports were based on a paper that was published in TACL in summer 2017, after its pre-print had been available on arXiv since November 2016. The paper...

mt-weekly  en  ⏰ 3.4 min

Machine Translation Weekly 6: Probing the Need for Visual Context in Multimodal Machine Translation

This week, we will have a look at a paper that won the best short paper award at NAACL 2019. The name of the paper is Probing the Need for Visual Context in Multimodal Machine Translation and it was written by friends of mine from...

mt-weekly  en  ⏰ 2.3 min

Machine Translation Weekly 5: Revisiting Low-Resource Neural Machine Translation

Let’s continue with pre-prints of papers which are going to appear at ACL this year and have a look at another paper that comes from the University of Edinburgh, titled Revisiting Low-Resource Neural Machine Translation: A Case Study. This paper is a reaction to an...

mt-weekly  en  ⏰ 2.0 min

Machine Translation Weekly 4: Analyzing Multi-Head Self-Attention

With the ACL camera-ready deadline slowly approaching, future ACL papers start to pop up on arXiv. One of those which went public just a few days ago is a paper called Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned...

mt-weekly  en  ⏰ 4.0 min

Machine Translation Weekly 3: Constant-Time Machine Translation with Conditional Masked Language Models

This week, we will have a look at a brand-new method for non-autoregressive machine translation published a few weeks ago on arXiv by Facebook AI Research, two days before the start of the anonymity period for the EMNLP conference. Most models for neural machine translation work autoregressively. When...

mt-weekly  en  ⏰ 2.9 min
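For context on what “constant-time” is contrasted with: a standard autoregressive decoder needs one model call per output token, as in this minimal greedy-decoding sketch (`model` is a hypothetical function returning next-token log-probabilities, not a real API).

```python
# Minimal sketch of greedy autoregressive decoding; `model` is a hypothetical
# callable: (source_ids, prefix_ids) -> list of next-token log-probabilities.

def greedy_decode(model, source_ids, bos_id, eos_id, max_len=100):
    prefix = [bos_id]
    for _ in range(max_len):  # one model call per generated token
        logprobs = model(source_ids, prefix)
        next_id = max(range(len(logprobs)), key=logprobs.__getitem__)
        prefix.append(next_id)
        if next_id == eos_id:
            break
    return prefix[1:]
```

A non-autoregressive decoder based on a conditional masked language model instead predicts all positions at once and refines them in a constant number of passes.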

Machine Translation Weekly 2: BERTScore

Last week, there was a paper on arXiv that introduces a method for MT evaluation using BERT sentence representations. The metric seems to be the new state of the art in MT evaluation. Its name is BERTScore and it was developed at Cornell University. MT evaluation...

mt-weekly  en  ⏰ 2.7 min
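The core of the metric is easy to sketch (a toy restatement of the idea; random vectors stand in for contextual BERT embeddings, and the real metric adds IDF weighting and rescaling): candidate and reference tokens are greedily matched by cosine similarity.

```python
import numpy as np

def bertscore_f1(cand_emb, ref_emb):
    """Greedy cosine matching between token embeddings, BERTScore-style.
    cand_emb, ref_emb: (num_tokens, dim) arrays of contextual embeddings."""
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T                       # pairwise cosine similarities
    precision = sim.max(axis=1).mean()  # best reference match per candidate token
    recall = sim.max(axis=0).mean()     # best candidate match per reference token
    return 2 * precision * recall / (precision + recall)

rng = np.random.default_rng(2)  # random stand-ins for BERT outputs
print(bertscore_f1(rng.normal(size=(5, 768)), rng.normal(size=(7, 768))))
```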

Machine Translation Weekly 1: Bidirectional Decoding

This is the first post from a series in which I will try to come up with summaries of some of the latest papers and other news on machine translation. The main goal of this exercise is to force myself to read new papers regularly...

mt-weekly  en  ⏰ 2.9 min

Tak trochu fake news o umělé inteligenci

English version of the post. When I read in the media about technologies that use machine learning, or artificial intelligence as people now say, I am often surprised at how the stories are not only oversimplified but, above all, misleading. This surely happens in all fields; nevertheless, getting indignant...

popularization  cs  ⏰ 5.4 min

Kind of fake news on artificial intelligence

Czech version of the post. While reading news stories on research or products involving deep learning, I am often surprised at how inaccurate and misleading the news stories are. It is probably a problem of almost all expert fields that happen to appear in the media; luckily, they do...

popularization  en  ⏰ 5.9 min

Nesnesitelná soutěživost umělých inteligentů

English version of the post. Since I was little, I thought biathlon was a weird sport. It puzzled me how anyone could come up with the idea of competing in things as different as skiing and shooting. A slightly bigger surprise came when I learned about the existence of the modern pentathlon. Thanks to...

popularization  cs  ⏰ 9.1 min

Further, faster, stronger, dear AI

Czech version of the post. Since I was a little boy, I have been astonished at what a weird sport biathlon is. I could not imagine how someone could possibly invent a combination of cross-country skiing and shooting. It blew my mind when I found out there is an even weirder combination...

popularization  en  ⏰ 6.8 min

Computational Linguistics in the 21st century – a private manifesto of its perpetual student

Czech version of the post. In this essay, I would like to sum up my opinions on what the role of computational linguistics is, why people should concern themselves with it, what I believe its current problems are, and most importantly why it is a very exciting field...

popularization  en  ⏰ 8.2 min

Počítačová lingvistika 21. století – soukromý manifest jejího věčného studenta

English version of the post. In this post, I will try to give a biased and engaged summary of what computational (or, if you prefer, mathematical) linguistics is, what its current problems are, and why it is nevertheless a fascinating field worth pursuing. Computational linguistics is something between...

popularization  cs  ⏰ 7.3 min

Deep learning a psaní I/Y

English version of the post. In recent years, deep learning (machine learning with neural networks) has become a buzzword of the technological world. We can read articles about what artificial intelligence can do (the worn-out term artificial intelligence is nowadays happily replaced by machine intelligence): how it can solve machine translation,...

popularization  cs  ⏰ 10.1 min

Spell checking of y and i in Czech using deep learning

Czech version of the post. In recent years, deep learning (machine learning with neural networks) became a frequently used buzzword in the technological world. We can find plenty of articles on how machine intelligence (a new, probably sexier term for artificial intelligence) can solve machine translation,...

popularization  en  ⏰ 9.9 min

What is Neural Machine Translation Capable of?

Czech version of the post. A year ago at EMNLP in Lisbon, I saw a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and almost apologized to everybody who passed by that the...

popularization  en  ⏰ 4.7 min

Co dovede neuronový překlad?

English version of the post. A year ago at the EMNLP conference in Lisbon, I spotted a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster, and to everyone who came to his poster...

popularization  cs  ⏰ 4.2 min

subscribe via RSS