Projects
Large Commercial Neural Machine Translation
January 2023 to September 2023
- Responsibility: developing a commercial NMT system based on an LLM; designing the multilingual training data proportions
- Achievements: PanGu LLM-based translation performance on par with the existing end-to-end NMT model in 10+ languages
End-to-end Simultaneous Speech Translation
July 2022 to May 2023
- Responsibility: designing and evaluating the first end-to-end simultaneous translation system for Huawei meeting
- Achievements: The end-to-end simultaneous translation system has lower latency and higher performance than the cascaded translation system
Cascaded Simultaneous Speech Translation
June 2021 to July 2022
- Responsibility: designing and developing a cascaded simultaneous translation system for Huawei meeting
- Achievements: SST end-to-end BLEU scores (EN<->ZH) on par with existing commercial systems; the service successfully supported 110+ online conference sessions and 30+ onsite SST services for the 2021/2022 Huawei STW event
Knowledge-enhanced NMT
September 2020 to May 2021
- Responsibility: designing, implementing, and evaluating knowledge-enhanced NMT algorithms based on topic modelling and homographic representation learning (HDR)
- Achievements: BLEU improvements over the vanilla Transformer: topic-enhanced model (+1.57, EN->DE) and HDR (+2.3, EN->RU); algorithms deployed in the related Huawei machine translation service; 2 peer-reviewed conference papers
Domain-specific NMT
June 2020 to February 2021
- Responsibility: developing an ICT domain-specific machine translation model that leverages domain dictionaries
- Achievements: BLEU increase of +0.7 over the SOTA EN->ZH working model
Machine Translation Data Processing Pipeline
July 2019 to July 2020
- Responsibility: leading development of a corpus processing pipeline covering corpus cleaning, tokenization, data augmentation, and other functions such as visualization
- Achievements: Huawei machine translation data processing pipeline V1.0