Data Engineer (Team Lead - Localization & Language Data)
Alexa Translations, Toronto, Ontario 🇨🇦
Job Details
Description
About Alexa Translations
Alexa Translations provides AI-powered translations for the largest and most prestigious legal, financial, and government institutions. Our unique combination of advanced technology and professionally certified translators delivers tailored solutions with unparalleled quality. Thanks to over two decades of award-winning client success, you can rely on us as a true extension of your team.
Our core values:
• Innovation
• Dedication
• Fanatical commitment to quality and service
• Resourcefulness
• Collaboration
Role Overview
As the Data Engineering Team Lead, you will be the technical lead and people manager for a specialized team at the intersection of Big Data, Global Communications, and Generative AI. You will oversee the development of our enterprise Data Warehouse, ensuring that our language assets (TMs, Glossaries, and Metadata) are structured, searchable, and optimized for human translators, machine translation searches, and machine learning models.
Beyond traditional data engineering, you will collaborate with multiple teams on the design of the platform interface and the indexing strategies that power our next-generation localization workflows. Your unique value lies in bridging the gap between high-level data architecture, the nuanced translation domain, and the emerging requirements of Retrieval-Augmented Generation (RAG).
Key Responsibilities
1. Data Strategy & Warehousing
• Architecture: Define the roadmap for our data warehouse, ensuring high availability and performance for massive multilingual datasets.
• Data Cataloging & Governance: Implement robust cataloging solutions to ensure data lineage and "discoverability" across the organization.
• Interface Development: Lead the creation of a user-centric interface that allows stakeholders to interact with, query, and extract data from the platform.
2. Translation & Localization Domain
• Linguistic Asset Management: Manage the lifecycle of Translation Memories (TMs) and Terminology Databases.
• Systems Expertise: Optimize integrations between our data platform and CAT tools and translation management systems (TMS) (e.g., Phrase, Trados, memoQ).
• Domain Integration: Ensure data pipelines respect the nuances of translation metadata, XLIFF structures, and regional variants.
3. ML & GenAI Integration
• RAG & Indexation: Oversee the creation and maintenance of Vector Databases and semantic search indexes to support Retrieval-Augmented Generation for automated translation and content creation.
• Data Preparation for LLMs: Architect pipelines that clean, chunk, and format localization data for fine-tuning or prompting Large Language Models (LLMs).
• Quality & Evaluation: Support the implementation of automated quality estimation (QE) and LLM-based evaluation metrics for translated content.
4. Leadership & Mentorship
• Team Management: Lead a cross-functional team of Linguists, Software Developers, DevOps Engineers, and Localization Engineers, providing technical guidance and mentorship.
• Cross-functional Collaboration: Act as the liaison between Data, Localization, and AI/ML Research teams.
Required Qualifications
• Experience: 5+ years in Data Engineering, with at least 2 years in a leadership or senior capacity.
• Technical Stack: Proficiency in SQL, Python, ETL pipelines, and cloud data platforms (e.g., AWS S3 Data Lakes, AWS Athena, AWS Redshift, AWS Glue).
• AI/ML Fundamentals: Solid understanding of the GenAI lifecycle, specifically regarding how data is indexed for RAG (e.g., Pinecone, Milvus, or Qdrant).
• Domain Knowledge: Understanding of the localization industry, including experience with TMX, TBX, and CAT tool workflows.
• Product Mindset: Experience building and deploying production-ready internal tools or interfaces (e.g., Streamlit, React) to democratize data access.
Preferred Skills
• Familiarity with embedding models and semantic similarity scoring.
• Knowledge of data privacy and security standards (ISO 27001, GDPR), specifically regarding PII in linguistic datasets.
Benefits & Perks You’ll Love:
• Comprehensive Health Insurance: Including vision, dental, complementary therapies, and support for your overall well-being.
• Your Birthday Off: We celebrate your special day!
• 6 Personal/Sick Days: Take the time you need for your health or life’s unexpected moments.
• Work-Ready Equipment: Get the tools you need to succeed, provided upon request.
• Hybrid Work Model: Enjoy the best of both worlds with a mix of in-office collaboration and remote flexibility.
• Learning & Growth Opportunities: Training and resources tailored to your role and department.
• Supportive & Collaborative Team Culture: Work alongside team members who genuinely have your back.
• Team Recognition & Action Awards: Celebrate wins and contributions in meaningful ways.
• Employee Referral Program: Earn rewards for bringing amazing talent to our team.
#Li-hybrid