
Preprocessing OCR features from document extraction (with Klippa Groningen)

About the project

The Klippa platform allows its users to automate their business processes. Klippa has multiple services focused on extracting data from documents and returning it in a structured format.

On top of this, Klippa provides a no-code automation service, the Klippa flow builder. It allows our users to connect standard automation to our Klippa services, similar to products like Zapier.

Getting good structured data as output requires good input as well. The input is often outside our clients' control, because their own clients (the end users) send in the documents. These can be PDFs, but also photos of documents taken with a mobile device.

The assignment

As a platform user, I want to be able to select preprocessing actions to be placed in front of our document extraction services, so that I get better input and output.

1) Look into preprocessing options and advise on which ones are beneficial to Klippa
2) Develop preprocessing tools for document processing
3) Add these to our flow builder
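To make the idea of selectable preprocessing actions concrete, here is a minimal sketch in plain Python. It is only an illustration, not Klippa's implementation. The action names (`binarize`, `invert`), the `ACTIONS` registry and the `apply_actions` helper are hypothetical. Otsu thresholding is one common binarization technique, picked here as an example and not named in the assignment. Images are assumed to be grayscale, stored as lists of pixel rows with values from 0 to 255.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def otsu_threshold(image: Image) -> int:
    """Return the grey level that best separates dark from light pixels (Otsu's method)."""
    histogram = [0] * 256
    for row in image:
        for pixel in row:
            histogram[pixel] += 1
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += histogram[level]
        sum_dark += level * histogram[level]
        if weight_dark == 0:
            continue  # no pixels at or below this level yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # all pixels are already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance (up to a constant factor).
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(image: Image) -> Image:
    """Map every pixel to pure black or pure white using the Otsu threshold."""
    threshold = otsu_threshold(image)
    return [[255 if pixel > threshold else 0 for pixel in row] for row in image]


def invert(image: Image) -> Image:
    """Swap dark and light, e.g. for light text on a dark background."""
    return [[255 - pixel for pixel in row] for row in image]


# Hypothetical registry: the names a platform user could pick in a flow.
ACTIONS: Dict[str, Callable[[Image], Image]] = {
    "binarize": binarize,
    "invert": invert,
}


def apply_actions(image: Image, selected: List[str]) -> Image:
    """Run the selected preprocessing actions in order, before extraction."""
    for name in selected:
        if name not in ACTIONS:
            raise ValueError(f"Unknown preprocessing action: {name}")
        image = ACTIONS[name](image)
    return image
```

A flow step could then call, for example, `apply_actions(page, ["binarize"])` before handing `page` to the extraction service. A real tool would more likely build on an imaging library and add actions such as deskewing or denoising.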


SEARCH Group • University of Groningen • 2023