Local Byte Fusion for Neural Machine Translation
In collaboration with researchers from the University of Wisconsin-Madison, we propose Local Byte Fusion (LOBEF), a tokenization method for byte-based machine translation. The method outperforms other techniques, particularly in multilingual translation.
Subword tokenization, currently the dominant technique, has limitations on multilingual corpora. LOBEF instead uses byte n-grams and word boundaries to aggregate local semantic information, which gives it a universal tokenization scheme that works well across languages. In our experiments, the method outperforms traditional byte-based models and subword techniques on multilingual translation, zero-shot cross-lingual transfer, and domain adaptation.
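To make the idea of a universal byte vocabulary concrete, here is a minimal Python sketch of turning text into UTF-8 byte IDs and listing its byte n-grams. The function names to_byte_ids and byte_ngrams are hypothetical and chosen for illustration only; this is not LOBEF's implementation, which fuses n-gram information with learned layers rather than enumerating windows.

```python
def to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer (0-255) per byte."""
    return list(text.encode("utf-8"))


def byte_ngrams(byte_ids: list[int], n: int) -> list[tuple[int, ...]]:
    """Return every contiguous window of n byte IDs, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(byte_ids[i:i + n]) for i in range(len(byte_ids) - n + 1)]


# Any script maps into the same 256-symbol vocabulary:
# "n" is one byte, while "é" takes two bytes in UTF-8.
ids = to_byte_ids("né")
bigrams = byte_ngrams(ids, 2)
print(ids)      # [110, 195, 169]
print(bigrams)  # [(110, 195), (195, 169)]
```

Because a single non-ASCII character can span several bytes, neighboring bytes have to be combined before they carry meaning, which is the motivation for aggregating local information.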
LOBEF has two variants: n-gram Convolution Fusion (nCF) and Word-based Self-attention Fusion (WSF). nCF uses four convolutional layers to aggregate character-level information, while WSF uses word boundaries with block-wise self-attention to aggregate word-level information. Our results indicate that byte-based models outperform subword baselines on the Flores-101 dataset. Additionally, byte-based models are smaller than comparable subword models and are 20% faster to train.
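The block-wise self-attention used in WSF can be pictured as a mask that lets each byte attend only to bytes in the same word. The Python sketch below builds such a mask under two assumptions that are not from the source: word boundaries are detected at ASCII spaces, and a space is grouped with the word that precedes it. The names word_block_ids and block_attention_mask are hypothetical.

```python
def word_block_ids(text: str) -> list[int]:
    """Assign each UTF-8 byte the index of the word block it belongs to.

    Assumption: an ASCII space (0x20) closes the current block and is
    itself counted as part of that block.
    """
    block_ids = []
    block = 0
    for byte in text.encode("utf-8"):
        block_ids.append(block)
        if byte == 0x20:
            block += 1
    return block_ids


def block_attention_mask(block_ids: list[int]) -> list[list[bool]]:
    """mask[i][j] is True when byte i may attend to byte j (same block)."""
    return [[a == b for b in block_ids] for a in block_ids]


ids = word_block_ids("ab cd")
mask = block_attention_mask(ids)
print(ids)  # [0, 0, 0, 1, 1]
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In a real model, a mask like this would be applied to the attention scores so that positions outside a byte's word block receive no weight. Languages without whitespace word boundaries would need a different boundary rule than the one assumed here.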