This job posting is no longer available

Data Engineer in Madrid

Ontruck

Category

Data Engineer

Industry

Logistics

Workplace

On-site

Hours

Full-Time

Internship

No

Skills

Spark, Python, Kubernetes, Kinesis, Kafka, ETL, Docker

Job description

About Ontruck

Ontruck is transforming the road transportation industry, a market worth €600 billion in Europe alone. We make trucking simple, transparent and on-demand.

Ontruck is a B2B logistics platform, connecting companies looking to move pallets with our network of carriers. We offer shippers a web platform to make the process of finding the right truck quick and simple with built-in track and trace. Carriers are able to accept shipments through a mobile app, letting them grow and manage their business with ease.

Our team has deep experience in building great products and companies. We know success, we know failure; we have built platforms from scratch, and we have dealt with large legacy systems. We care about each other and about the products and services we are building.

Ontruck is backed by the top investors in Europe.

To learn more, visit www.ontruck.com.

Become a Data Engineer at Ontruck

At Ontruck, data engineering focuses on practical applications of data collection and analysis. The team takes care of the mechanisms for collecting and validating large sets of information.

What we do

Build, maintain and evolve our data platform
Collect data from internal and external sources
Transform the data into usable and understandable formats
Load this data into controlled areas so that other teams can use it
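
The collect, transform and load steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ontruck's actual pipeline: the field names (`id`, `pallets`), the source names and the in-memory `warehouse_table` list are all hypothetical stand-ins for real sources and storage.

```python
def collect(sources):
    """Gather raw records from every source, tagging each with its origin."""
    raw = []
    for name, records in sources.items():
        for record in records:
            raw.append({"source": name, **record})
    return raw


def transform(raw_records):
    """Normalise field names and types into one understandable shape."""
    clean = []
    for record in raw_records:
        clean.append({
            "source": record["source"],
            "shipment_id": str(record["id"]),   # ids become strings everywhere
            "pallets": int(record["pallets"]),  # counts become integers
        })
    return clean


def load(clean_records, controlled_area):
    """Append the cleaned records to the controlled area; return the row count."""
    controlled_area.extend(clean_records)
    return len(clean_records)


# Hypothetical internal and external sources with inconsistent types.
sources = {
    "internal": [{"id": 1, "pallets": "4"}],
    "external": [{"id": "A7", "pallets": 2}],
}
warehouse_table = []  # stand-in for a controlled storage area
loaded = load(transform(collect(sources)), warehouse_table)
```

Keeping each step as a separate function means a new source or a new cleaning rule can be added without touching the other steps.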

What we do NOT do

Analyze data to provide the business team with data-driven insight
Create or train machine learning models
Create features in other Ontruck products
Validate or invalidate analysis or experiment with data analysis

The product Data Engineering creates

At Ontruck, Data Engineering exposes a data lake to the Data Analytics and Data Science teams. The data lake is meant to be a place of discovery for these teams. Because the data is kept raw, it takes less work for the Data Engineering team to manage, and no data that could be useful to skilled explorers is eliminated.

More broadly within Ontruck, Data Engineering exposes a data warehouse of tables that are structured to be queried quickly and only contain a subset of all the data in the lake. For Ontruck, all of our data goes through the lake before it gets to the warehouse, and only the data that we know is useful and worth cleaning gets to the warehouse.

These tables are meant to be more easily understood and allow for varying levels of access to sensitive data through different schemas.

The warehouse has been cleaned for us; it’s in tables that make sense for known use cases, and you can get answers out of it quickly.
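
As a rough illustration of the lake-to-warehouse flow described above, the sketch below keeps every raw event in the lake, promotes only the event type known to be useful, and exposes the result through two schemas with different access to sensitive data. Everything named here is an assumption made for the example: the event types, the `driver_phone` field, and the `restricted` and `general` schema names are hypothetical, not Ontruck's real data model.

```python
# Hypothetical raw events as they might sit in the lake: nothing is dropped.
lake = [
    {"type": "shipment_created", "shipment_id": 1, "pallets": "4", "driver_phone": "600000001"},
    {"type": "debug_ping", "payload": "ok"},
    {"type": "shipment_created", "shipment_id": 2, "pallets": "2", "driver_phone": "600000002"},
]

# Fields that only the restricted schema is allowed to expose (assumed).
SENSITIVE_FIELDS = {"driver_phone"}


def build_warehouse(lake_events):
    """Promote known-useful events into cleaned tables under two schemas."""
    warehouse = {"restricted": [], "general": []}
    for event in lake_events:
        if event["type"] != "shipment_created":
            continue  # stays in the lake only, never reaches the warehouse
        row = {
            "shipment_id": event["shipment_id"],
            "pallets": int(event["pallets"]),  # cleaned to an integer
            "driver_phone": event["driver_phone"],
        }
        warehouse["restricted"].append(row)
        # The general schema gets the same row without sensitive fields.
        warehouse["general"].append(
            {key: value for key, value in row.items() if key not in SENSITIVE_FIELDS}
        )
    return warehouse


warehouse = build_warehouse(lake)
```

The lake still holds all three events afterwards, so an analyst exploring raw data can find the `debug_ping` record even though it never reaches a warehouse table.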

What you will be doing day to day:

Maintain, expand and improve our data processes and computational infrastructure. At Ontruck we use a mix of tools in constant evolution: StreamSets, Tableau, Airflow, Superset, Druid, Spark, MLeap and TensorFlow.
Develop our core data infrastructure, carrying out and reporting on proofs of concept for different tools and ideas.
Take responsibility for initiatives related to building, maintaining and orchestrating all the components of our data platform.
Work with different data sources (internal and external), processing and transforming them and making them accessible to other teams.

Requirements

Your skills and experience:

Degree in Computer Science or related technical field
At least 3 years of development experience in Python (or any other object-oriented language used for data processing)
Experience as a Data Engineer or related specialty (e.g., Software Engineer, Business Intelligence Engineer, Data Scientist) with a track record of manipulating, processing, and extracting value from large datasets
Demonstrated strength in data modeling, scalable ETL development, and data warehousing. We use StreamSets, Airflow and Superset.
Strong experience in optimising and performance-tuning Postgres. Hands-on experience in configuring and supporting replication, and a background in building large database infrastructure that supports a high volume of transactions in a high-demand environment
Hands-on experience with container orchestration using Kubernetes and Docker
Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
Experience building data products incrementally and integrating and managing datasets from multiple sources
Understanding of engineering best practices: writing tests, using automation, building continuous integration pipelines, etc.
Experience with data streaming platforms (Spark, Kafka, Kinesis, etc.) would be an advantage!

Other relevant information

Opportunities for personal growth and learning, every single day.
A flat, laid-back culture. Everybody is encouraged to participate in discussions and contribute.
High-trust environment. We believe in giving autonomy to all our employees.
Competitive compensation packages. We are looking for the very best talent, and will reward accordingly.
Awesome offices in central Madrid. We are easily accessible by public transport, as well as close to public bike stations.
Flexible schedule.

About Ontruck

Website
http://www.ontruck.com
Industry
Logistics
Headquarters
Madrid, Spain
Company size
50-200
Founded
2016