Senior Data Engineer - Machine Learning in Barcelona



Job Description

At Preply, we are unlocking human potential through learning.

We believe learning with a great tutor is life-changing. That’s why we match online tutors from across the globe with learners and empower them to create live language classes with AI-powered tools and learning materials. This is how we deliver progress, create engagement and keep our global community of learners motivated. So far, over 32,000 tutors have given more than 15 million lessons to learners from over 175 countries.

Meet the team!

Our Data Strategy team collects and processes information from multiple sources and makes it available to the organization through our self-service analytics layer (you can check our tech stack here). Our Analytics teams use this data to generate insights and build predictive models. Together we inform the business, highlight opportunities, develop data-driven core features, and optimize acquisition and monetisation. Ultimately, we have a direct impact on how we grow our business while improving our product and the experience of learners and tutors.

At Preply, we are constantly aiming to empower the business and product with AI capabilities. We are fully committed to developing our ML platform in symphony with the rest of the data architecture to enable a wide team of Data Scientists to deploy multiple models across the organization. This, along with the wide scope of our product and the sophistication of the tutor/learner interactions, makes for some rewarding challenges.

Do you love data? Do you want to work with a modern data stack built on Snowflake, Airflow, dbt, Looker, Databricks, and Monte Carlo, learn more about these tools, or help choose the next ones? Do you love SQL or Spark? Do you love transforming and orchestrating data? Do you believe the data team should adopt data testing and data governance processes? Join our Data chapter!

As part of this role, you will participate in critical architectural decisions, evaluating multiple approaches to empower our people to inform their decisions with quality data. You will heavily contribute to selecting the appropriate tech stack and defining new data/ML standards.

What you’ll be doing:

  • Enable DS/ML Teams: Assist data science and machine learning teams in building and operationalizing their critical ML models, providing guidance and expertise throughout the process.
  • Develop the MLOps framework: Make sure the ML capabilities of the company are extended to multiple teams, enabling reusability and standardization through an effective MLOps framework.
  • Scaling and Evolving Infrastructure: Enhance our multi-tiered data infrastructure, focusing on cost efficiency, performance optimization, scalability, and reliability.
  • Designing Future-Ready Solutions: Develop and implement innovative strategies to meet current requirements while anticipating the evolving needs of our internal ML platform.
  • Mentoring Data Engineers: Serve as a mentor to fellow Data Engineers, fostering their professional growth and skill development.
  • Collaborating with the rest of the Data Platform: Work closely with our existing Data Platform team to ensure seamless integration and expansion of our data models and unlock new use cases.
  • Enhancing Data Processing: Improve data ingestion and processing capabilities in our data lake, data warehouse, and ML platform.
  • Implementing Data Quality Measures: Implement data quality tools and processes to ensure the integrity and reliability of our data assets.

What you need to succeed:

  • Bachelor's or Master's degree in Computer Science or Engineering (or equivalent work experience).
  • Previous experience as a Senior Data Engineer or in a similar role is essential.
  • Strong coding skills in languages like Python or Scala, coupled with proficient SQL experience. Experience with distributed bulk data processing frameworks like Spark or Presto is required.
  • Previous experience in setting up and scaling Databricks in a medium-sized organization is a must.
  • Experience working with MLOps tooling, such as Feast for a feature store and MLflow for experiment tracking and model registry.
  • Cloud Expertise. Proficiency in maintaining high-volume distributed data processing platforms on a cloud provider such as AWS or GCP.
  • Kubernetes Proficiency. Ability to deploy and support tooling on Kubernetes.
  • Strong curiosity, problem-solving abilities, and a knack for identifying and addressing complex issues.
  • Innovative Mindset. Creative thinking and a proactive attitude toward devising and evaluating new solutions.
  • Excellent written and verbal communication skills in English are a must.

Nice to have:

  • Previous experience in scaling startup infrastructures would be advantageous.
  • Familiarity with orchestration tools like Airflow, Luigi, or Jenkins.
  • Experience managing multi-tiered data systems (data lake + data warehousing architectures).
  • Understanding and proficiency in Terraform for infrastructure-as-code.
  • Experience building production ML pipelines using Python (PySpark) or Scala (Spark).

Why you’ll love it at Preply:

  • An open, collaborative, dynamic and diverse culture;
  • A generous monthly allowance for lessons, a Learning & Development budget and time off for your self-development;
  • A competitive financial package with equity, leave allowance and health insurance;
  • Not in Barcelona? We offer an attractive relocation package to join us in our Preply Barcelona Hub;
  • Access to free mental health support platforms;
  • Access to Gympass-partnered wellness and gym centers throughout Spain to promote and support well-being and physical health;
  • The opportunity to unlock the potential of learners and tutors through language learning and teaching in 175 countries (and counting!).

Our Principles

  • Care to change the world - We are passionate about our work and care deeply that its impact is life-changing.
  • We do it for learners - For both Preply and tutors, learners are why we do what we do. Every day we focus on empowering tutors to deliver an exceptional learning experience.
  • Keep perfecting - To create an outstanding customer experience, we focus on simplicity, smoothness, and enjoyment, continually perfecting it as every detail matters.
  • Now is the time - In a fast-paced world, it matters how quickly we act. Now is the time to make great things happen.
  • Disciplined execution - What makes us disciplined is the excellence in our execution. We set clear goals, focus on what matters, and utilize our resources efficiently.
  • Dive deep - We leverage business acumen and curiosity to investigate disparities between numbers and stories, unlocking meaningful insights to guide our decisions.
  • Growth mindset - We proactively seek growth opportunities and believe today's best performance becomes tomorrow's starting point. We humbly embrace feedback and learn from setbacks.
  • Raise the bar - We raise our performance standards continuously, alongside each new hire and promotion. We build diverse and high-performing teams that can make a real difference.
  • Challenge, disagree and commit - We value open and candid communication, even when we don’t fully agree. We speak our minds, challenge when necessary, and fully commit to decisions once made.
  • One Preply - We prioritize collaboration, inclusion, and the success of our team over personal ambitions. Together, we support and celebrate each other's progress.

Diversity, Equity, and Inclusion

Preply is committed to creating a diverse and inclusive environment where people from all backgrounds can thrive. Different opinions and viewpoints are key ingredients in our success as a multicultural Ed-Tech company.

Preply will consider all applications for employment without regard to race, color, religion, gender identity or expression, sexual orientation, national origin, disability, age or veteran status. Together, we are The World Class.


About Preply

  • Edtech

  • Brighton, MA

  • 50-200

  • 2012

