DataDome is a global cybersecurity company specializing in protecting and accelerating digital businesses.
50% of web traffic is generated by bots. DataDome has implemented a self-adaptive bot detection and identification strategy to protect mobile websites and APIs. A dashboard lets our customers monitor non-human traffic on their sites and implement a blocking and filtering strategy that fits their security and business requirements.
Our technical stack is mainly composed of Java for our real-time detection layer, a low-latency stream engine in Scala running on Flink, ElasticSearch for storage, and Symfony 4 and Angular 7 for our dashboards.
We operate at scale, handling over 2 billion hits per day and responding in less than 3 ms (99th percentile). We are currently present in more than 12 datacenters around the world, deployed using Docker.
Job description
We are looking for a Data Scientist to take on our technical challenges and to design and develop our solutions.
As a Data Scientist/Engineer, on a typical day you might:
- Extract samples from our large traffic datasets
- Look for new features to implement
- Use our feedback loop to update models in near real time
- Move batch detection to real-time stream analysis
- Read state-of-the-art research papers
Your profile:
- You have been working on ML for 3+ years
- You have a strong programming background in Python, Java or Scala
- You have solid experience working in Unix/Linux environments
- You care about code quality, simplicity and performance
- You have a BS/MS/PhD in a scientific field or equivalent experience
- You are more interested in R&D than in writing thousands of lines of code
- You can read a research paper and implement it
- You are not afraid of looking at large datasets
- You've worked at high scale with systems like Apache Kafka, Apache Flink or ElasticSearch
- You have written your own crawler once or twice
- You understand how the internet works