The ultimate goal of AI is a world taking care of itself, so that humans can finally become a pack of carefree, barefooted, long-haired hippies. (← irony)

Posts

Machine Translation Weekly 22: Understanding Knowledge Distillation in Non-Autoregressive Machine Translation

Last week, I discussed a paper claiming that forward-translation might be a better data augmentation technique than back-translation. This week, I will follow up with a paper that touches on a similar topic, but in a slightly different context. The paper is called Understanding Knowledge Distillation in Non-Autoregressive Machine Translation and was...
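For readers who do not have the two terms firmly associated: the two augmentation schemes differ only in which side of a synthetic sentence pair is machine-generated. Here is a minimal Python sketch; the translate_* arguments are hypothetical stand-ins for already trained MT systems.

```python
# Minimal sketch of the two data-augmentation schemes. The translate_*
# arguments are hypothetical stand-ins for already trained MT systems.

def back_translation(tgt_monolingual, translate_tgt2src):
    """Synthetic pairs with an authentic target side and a
    machine-translated source side."""
    return [(translate_tgt2src(t), t) for t in tgt_monolingual]

def forward_translation(src_monolingual, translate_src2tgt):
    """Synthetic pairs with an authentic source side and a
    machine-translated target side."""
    return [(s, translate_src2tgt(s)) for s in src_monolingual]
```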

mt-weekly  en 

Machine Translation Weekly 21: On Translationese and Back-Translation

Does WMT speak translationese? And who else speaks translationese? Is the success of back-translation fake news? These are the questions implicitly asked by the authors of a paper called Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation that was uploaded to arXiv earlier...

mt-weekly  en 

Machine Translation Weekly 20: Search and Model Errors in Neural Machine Translation

This week, I will have a look at a paper from this year’s EMNLP that got a lot of attention on Twitter. If there were an award for the most disturbing machine translation paper, this would be a good candidate. The title of...

mt-weekly  en 

Machine Translation Weekly 19: Domain Robustness

This week, I will briefly have a look at a paper that discusses another major problem of current machine translation: the lack of domain robustness. The problem is very well analyzed in a paper from the University of Zurich called Domain Robustness in Neural Machine Translation...

mt-weekly  en 

Machine Translation Weekly 18: BPE Dropout

Everyone who followed natural language processing on Twitter last week must have noticed a paper called BPE-Dropout: Simple and Effective Subword Regularization that introduces a simple way of adding stochastic noise into text segmentation to increase model robustness. It sounds complicated, but it is fairly easy. As...
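To illustrate just how easy, here is a minimal Python sketch of the segmentation step, assuming a merge table that maps symbol pairs to their merge priorities; this is my own illustration, not the authors’ implementation.

```python
import random

def bpe_dropout_segment(word, merges, p_drop=0.1):
    """Segment a word into subwords with BPE, skipping each applicable
    merge with probability p_drop, so the segmentation of the same word
    varies across training examples.

    `merges` maps a symbol pair to its priority (lower = applied first).
    """
    symbols = list(word)
    while True:
        # Find all merges applicable to adjacent symbol pairs,
        # randomly dropping each candidate with probability p_drop.
        candidates = [
            (merges[(a, b)], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if (a, b) in merges and random.random() >= p_drop
        ]
        if not candidates:
            break
        _, i = min(candidates)  # apply the surviving merge with top priority
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols

# Toy merge table; with p_drop > 0, repeated calls may segment differently.
merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
print(bpe_dropout_segment("lower", merges, p_drop=0.5))
```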

mt-weekly  en 

Machine Translation Weekly 17: When is Document-Level Context Useful?

One of the biggest limitations of current machine translation systems is that they only work with isolated sentences. The systems need to guess when it comes to phenomena that cross the (often rather arbitrary) sentence boundaries. The typical example that is mentioned everywhere is the translation...

mt-weekly  en 

Machine Translation Weekly 16: Hybrid character-level and word-level machine translation

One of the topics I am currently dealing with in my research is character-level modeling for neural machine translation. I was therefore glad to see a paper called On the Importance of Word Boundaries in Character-level Neural Machine Translation, which appeared on arXiv last week and shows an interesting...

mt-weekly  en 

Machine Translation Weekly 15: How Multilingual is Multilingual BERT?

This week, I will slightly depart from machine translation and have a look at the paper How Multilingual is Multilingual BERT by Google Research. BERT, the Sesame Street muppet that recently colonized the whole area of natural language processing, is a model trained to predict...

mt-weekly  en 

Machine Translation Weekly 14: Modeling Confidence in Sequence-to-Sequence Models

Neural machine translation is based on machine learning—we collect training data, pairs of parallel sentences which we hope represent how language is used in the two languages, and train models using the data. When the model is trained, the more the input resembles the sentences...

mt-weekly  en 

Machine Translation Weekly 13: Long Distance Dependencies

Let us follow up on the gender paper and have a look at other cases where machine translation does not work as well as we would like. This time, we will have a look at a paper that talks about grammatically complex...

mt-weekly  en 

Machine Translation Weekly 12: Memory-Augmented Networks

Five years ago, when deep learning was slowly starting to be cool, a paper called Neural Turing Machines appeared (they are not really Turing machines, but at least they are neural in a narrow technical sense). The paper left me with a foolishly naive impression...

mt-weekly  en 

Machine Translation Weekly 11: Gender and Machine Translation

It’s time to talk about gender—why things go wrong with gender in machine translation and what people do about it. Some languages have gendered nouns (German), some have gendered almost everything (Czech, French), and some gender only a few pronouns (English). Let’s say you want to translate...

mt-weekly  en 

Machine Translation Weekly 10: Massively Multilingual Neural Machine Translation

The holiday period is over and I have almost settled in at my new place of operation, the Ludwig Maximilian University of Munich, and now there is nothing that can prevent me from continuing with weekly reports on what is new in the world of machine...

mt-weekly  en 

Machine Translation Weekly 9: Shared Task on Machine Translation Robustness

Machine translation is typically trained on bilingual data that can be found on the Internet. It mostly comes from international government and non-government organizations, commercial web presentations, books, movie subtitles, etc. Therefore, most of the text is quite formal and almost without typos and certainly...

mt-weekly  en 

Machine Translation Weekly 8: A Generalized Framework of Sequence Generation

This week’s post contains more math than usual. I will talk about a paper that unifies several decoding algorithms in MT using one simple equation. The paper is called A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models, and it comes from New...

mt-weekly  en 

Machine Translation Weekly 7: Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

Remember two years ago, when tech news sites enthusiastically reported that a translator by Google had created its own language? These reports were based on a paper that was published in TACL in the summer of 2017, after its preprint had been available on arXiv since November 2016. The paper...

mt-weekly  en 

Machine Translation Weekly 6: Probing the Need for Visual Context in Multimodal Machine Translation

This week, we will have a look at a paper that won the best short paper award at NAACL 2019. The name of the paper is Probing the Need for Visual Context in Multimodal Machine Translation and it was written by friends of mine from...

mt-weekly  en 

Machine Translation Weekly 5: Revisiting Low-Resource Neural Machine Translation

Let’s continue with pre-prints of papers which are going to appear at ACL this year and have a look at another paper that comes from the University of Edinburgh, titled Revisiting Low-Resource Neural Machine Translation: A Case Study. This paper is a reaction to an...

mt-weekly  en 

Machine Translation Weekly 4: Analyzing Multi-Head Self-Attention

With the ACL camera-ready deadline slowly approaching, future ACL papers start to pop up on arXiv. One of those which went public just a few days ago is a paper called Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned...

mt-weekly  en 

Machine Translation Weekly 3: Constant-Time Machine Translation with Conditional Masked Language Models

This week, we will have a look at a brand-new method for non-autoregressive machine translation published a few weeks ago on arXiv by Facebook AI Research, two days before the start of the anonymity period for the EMNLP conference. Most models for neural machine translation work autoregressively. When...
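To make the contrast concrete, here is a rough, heavily simplified sketch of the mask-predict refinement loop the paper builds on; the model interface (a callable scoring all target positions in parallel) is assumed here for illustration.

```python
import torch

def mask_predict(model, src, tgt_len, mask_id, iterations=10):
    """Decode all target tokens in parallel, then iteratively re-mask
    and re-predict the least confident ones."""
    tokens = torch.full((tgt_len,), mask_id, dtype=torch.long)
    probs = torch.zeros(tgt_len)
    for t in range(iterations):
        logits = model(src, tokens)              # (tgt_len, vocab_size)
        new_probs, new_tokens = logits.softmax(-1).max(-1)
        masked = tokens == mask_id
        tokens[masked] = new_tokens[masked]      # fill only the masked slots
        probs[masked] = new_probs[masked]
        # Linearly decrease how many tokens get re-masked each iteration.
        n_mask = int(tgt_len * (1 - (t + 1) / iterations))
        if n_mask == 0:
            break
        least_confident = probs.topk(n_mask, largest=False).indices
        tokens[least_confident] = mask_id
    return tokens
```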

mt-weekly  en 

Machine Translation Weekly 2: BERTScore

Last week, a paper appeared on arXiv that introduces a method for MT evaluation using BERT sentence representations. The metric seems to be the new state of the art in MT evaluation. It is called BERTScore and comes from Cornell University. MT evaluation...
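The core of the metric is simple enough to sketch: embed both sentences with BERT and greedily match tokens by cosine similarity in both directions. Below is a simplified version over pre-computed token embeddings (the metric’s optional IDF weighting is omitted):

```python
import torch

def bert_score_f1(cand_emb, ref_emb):
    """Simplified BERTScore from pre-computed BERT token embeddings:
    greedy cosine matching in both directions, combined into an F1 score.

    cand_emb: (len_candidate, dim), ref_emb: (len_reference, dim)
    """
    cand = cand_emb / cand_emb.norm(dim=-1, keepdim=True)
    ref = ref_emb / ref_emb.norm(dim=-1, keepdim=True)
    sim = cand @ ref.T                         # pairwise cosine similarities
    precision = sim.max(dim=1).values.mean()   # best match per candidate token
    recall = sim.max(dim=0).values.mean()      # best match per reference token
    return 2 * precision * recall / (precision + recall)
```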

mt-weekly  en 

Machine Translation Weekly 1: Bidirectional Decoding

This is the first post in a series in which I will try to come up with summaries of some of the latest papers and other news on machine translation. The main goal of this exercise is to force myself to read new papers regularly...

mt-weekly  en 

Tak trochu fake news o umělé inteligenci

English version of the post When I read in the media about technologies that use machine learning, or artificial intelligence as it is called these days, I am often surprised how the stories are not only oversimplified but, above all, misleading. This surely happens in all fields, but getting outraged...

popularization  cs 

Kind of fake news on artificial intelligence

Czech version of the post While reading news stories on research or products involving deep learning, I am often surprised at how inaccurate and misleading the stories are. It is probably a problem of almost all expert fields that happen to appear in the media; luckily, they do...

popularization  en 

Nesnesitelná soutěživost umělých inteligentů

English version of the post Since I was little, I have thought that biathlon is a weird sport. It puzzled me how anyone came up with the idea of competing in things as different as cross-country skiing and shooting. An even bigger surprise came when I learned about the existence of the modern pentathlon. Thanks to...

popularization  cs 

Further, faster, stronger, dear AI

Czech version of the post Since I was a little boy, I have been astonished by how weird a sport biathlon is. I couldn’t imagine how someone could possibly invent a combination of cross-country skiing and shooting. It blew my mind when I found out there is an even weirder combination...

popularization  en 

Computational Linguistics in the 21st century – a private manifesto of its perpetual student

Czech version of the post In this essay, I would like to sum up my opinions on what the role of computational linguistics is, why people should concern themselves with it, what I believe its current problems are, and most importantly why it is a very exciting field...

popularization  en 

Počítačová lingvistika 21. století – soukromý manifest jejího věčného studenta

English version of the post In this post, I will try to give a biased and engaged summary of what computational (or, if you prefer, mathematical) linguistics is, what its current problems are, and why it is nevertheless a fascinating field worth pursuing. Computational linguistics is something between...

popularization  cs 

Deep learning a psaní I/Y

English version of the post In recent years, deep learning (machine learning with neural networks) has become a buzzword of the technology world. We can read articles about what artificial intelligence can do (the worn-out term artificial intelligence is nowadays happily replaced by machine intelligence): how it can solve automatic translation,...

popularization  cs 

Spell checking of y and i in Czech using deep learning

Czech version of the post In recent years, deep learning (machine learning with neural networks) became a frequently used buzzword in the technology world. We can find plenty of articles on how machine intelligence (a new, probably sexier term for artificial intelligence) can solve machine translation,...

popularization  en 

What is Neural Machine Translation Capable of?

Czech version of the post A year ago at EMNLP in Lisbon, I saw a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He was standing in front of his poster and almost apologized to everybody who passed by that the...

popularization  en 

Co dovede neuronový překlad?

English version of the post A year ago at the EMNLP conference in Lisbon, I spotted a paper called On Statistical Machine Translation and Translation Theory by Christian Hardmeier. He stood in front of his poster, and to everyone who stopped by his poster...

popularization  cs 

subscribe via RSS