Here is what I found most interesting in MT and multilinguality in March.

This month, I feature only two papers (both from Microsoft, by coincidence), not because there were too few on arXiv, but because I did not manage to read much.

DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer

In this paper, folks from Microsoft in India experiment with zero-shot cross-lingual transfer for classification. They use a multi-task learning setup: besides performing the task in the source language, they teach the model using a two-player game. A discriminator (an adversarial classifier) tries to detect what language the sentence is in, while the model tries to fool the discriminator and conceal the language identity. They get good results on cross-lingual natural language inference, including the difficult AmericasNLI data that contain indigenous languages of the Americas. It is very similar to what we did back in 2020; however, we did that only for tasks that directly use the representations, and we did not get such good results.
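The two-player game is typically implemented with a gradient-reversal layer: the discriminator learns to predict the language, while the reversed gradient pushes the encoder to hide it. The following PyTorch sketch shows this generic adversarial setup, not the paper's exact DiTTO objective; all module sizes and the scaling factor are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Toy stand-ins: a multilingual encoder, a task head (e.g., NLI labels),
# and a discriminator guessing among 5 languages.
encoder = nn.Linear(16, 8)
task_head = nn.Linear(8, 3)
lang_disc = nn.Linear(8, 5)

x = torch.randn(4, 16)                 # toy batch of sentence representations
task_y = torch.randint(0, 3, (4,))     # task labels (source language only)
lang_y = torch.randint(0, 5, (4,))     # language-identity labels

h = encoder(x)
task_loss = F.cross_entropy(task_head(h), task_y)
# The discriminator still learns to predict the language, but the reversed
# gradient makes the encoder actively *hurt* it, concealing language identity.
adv_loss = F.cross_entropy(lang_disc(GradReverse.apply(h, 0.1)), lang_y)
loss = task_loss + adv_loss
loss.backward()
```

In a real run, both losses would be minimized jointly over the multilingual training data, with the reversal strength (here 0.1) tuned as a hyperparameter.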

Large Language Models Are State-of-the-Art Evaluators of Translation Quality

Folks from Microsoft in Munich showed that ChatGPT can be used for machine translation evaluation. MT evaluation seems to be one of the emergent properties of language models: they experimented with a bunch of OpenAI language models, and the ability only appeared starting from GPT 3.5 (Davinci-002). They use a zero-shot approach, i.e., they prompt the model with the same instructions human annotators get and let it generate a score of how good the translation is; the results are then ensembled across several prompt variants. The resulting metric is much better than all previously known ones (including the newest version of COMET).
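The recipe is simple enough to sketch: format the source and the translation into a scoring instruction, ask the model, parse out a number, and average over prompt variants. This is a simplified illustration, not the paper's actual prompts; `ask_model` is a hypothetical placeholder for an LLM API call.

```python
# Hypothetical prompt variants (illustrative, not the paper's wording).
PROMPT_TEMPLATES = [
    ("Score the following translation from {src_lang} to {tgt_lang} "
     "on a scale from 0 to 100.\nSource: {src}\nTranslation: {hyp}\nScore:"),
    ("Rate the quality of this {tgt_lang} translation of a {src_lang} sentence, "
     "0 (worst) to 100 (best).\nSource: {src}\nTranslation: {hyp}\nScore:"),
]

def llm_mt_score(src, hyp, src_lang, tgt_lang, ask_model):
    """Average the model's numeric answers over several prompt variants."""
    scores = []
    for template in PROMPT_TEMPLATES:
        prompt = template.format(
            src=src, hyp=hyp, src_lang=src_lang, tgt_lang=tgt_lang)
        answer = ask_model(prompt)  # stand-in for the actual LLM call
        digits = "".join(ch for ch in answer if ch.isdigit())
        if digits:
            scores.append(min(100, int(digits)))
    return sum(scores) / len(scores) if scores else None

# Usage with a stub standing in for the model:
score = llm_mt_score("Ahoj", "Hello", "Czech", "English",
                     lambda prompt: "Score: 95")  # → 95.0
```

The segment-level scores are then aggregated and correlated with human judgments, just like any other learned MT metric.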

What else is going on…

Chatbots are now everywhere. Folks from Stanford released their Alpaca: they took Meta's LLaMA (which is not really open source), fine-tuned it on data generated by ChatGPT (probably against OpenAI's terms of use), and got a surprisingly capable chatbot. A few weeks later came Vicuna, an even cooler chatbot based on the 13B LLaMA model.

OpenAI released GPT-4 and said almost nothing about how it was built. It is multimodal and better than the previous models.

Folks connected to the longtermism movement started a petition demanding a 6-month moratorium on training LLMs, to gain some time and use it to study the long-term effects of AI. They do not specify which long-term issues they mean. Longtermists are known to be willing to sacrifice the welfare of real people for the glory and greatness of potential future people. (You know, it may be quadrillions of people colonizing the entire galaxy in a distant future. Billions of real people on Earth struggling right now is nothing against it.) Also, I am not sure a six-month moratorium would really help.

As a reaction, the German non-profit LAION (which is also behind OpenAssistant) started a petition for an AI CERN, a center for large-scale AI research that would be under public control.