Who are we looking for?
We are looking for a Data Engineer/Data Scientist for a project that automates the processing of accounting documents. We work with searchable PDFs, scanned documents, and photos, including handwritten text. The work involves:
- Automating accounting processes: extracting key information from documents and making autonomous registrations
- Validating documents and detecting anomalies at the company level to give our customers better financial awareness
- Combining multiple data sources and historical data to gain a contextual understanding of documents
- Data munging – preferably with Python
- Machine learning: Python, Keras, TensorFlow, and Seldon for AI model deployment
- Google Cloud Vision for OCR
- Kubernetes and Docker, with Prometheus for monitoring
- GCP (Cloud Pub/Sub, Cloud SQL, Cloud Tasks, GKE, Cloud Storage, Cloud Functions)
Our technical approach:
- Machine learning for classification and clustering tasks on document data
- Neural networks and deep learning on scanned documents and other images
- Natural Language Processing
- OCR (optical character recognition) for text extraction
- Algorithms and heuristics that learn from previous mistakes and improve with every new document processed