We are building a multilingual Large Language Model tailored for Bahasa Indonesia and regional languages. We are looking for a passionate Lead Data Scientist to help shape the future of open and inclusive AI for Indonesia and to play a pivotal role in identifying impactful AI use cases. As a Lead Data Scientist working on LLMs, you will design and build high-quality datasets, develop advanced model pre-training, fine-tuning, and alignment techniques, and collaborate closely with product and engineering teams to ship safe, reliable LLM-powered features to millions of users. This role offers the opportunity to drive innovation, solve critical business challenges, and shape the future of AI-driven solutions at GoTo Group.
What You Will Do
Work with large-scale multilingual corpora, including text, audio, and image modalities.
Build high-quality datasets for continual pretraining, post-training (SFT, RLHF, DPO), and benchmark evaluation.
Contribute to the training and scaling of multilingual LLMs – from continual pretraining to supervised fine-tuning and alignment.
Implement state-of-the-art methods and research for efficient and scalable operations.
Implement and improve safety alignment and guardrail systems to ensure responsible and culturally appropriate model behavior.
Collaborate closely with business, product, and engineering teams to deploy production-grade LLM-powered solutions.
Stay current with advancements in AI technologies, including frontier models and training methodologies.
What You Will Need
7+ years of experience in deep learning, NLP, computer vision, voice.
Proficient in data preprocessing, model training, evaluation, and optimisation.
Practical experience in applying deep learning to solve real business problems, with models successfully deployed and used in production environments.
Proficient in Python and deep learning frameworks such as PyTorch or TensorFlow.
Experience with cloud platforms like Alicloud, GCP or AWS.
Strong communication skills to understand business needs and effectively convey analytical solutions.
Ability to write clear and concise technical documentation.
A Master’s or PhD in Computer Science, Data Science, AI, or a related field.