Description
Data extraction is a critical component of document processing in Grooper. If nothing else, there's some set of data you want to collect from your documents. For each field you want to collect from a document, you'll need to create and configure a data extractor. But there is so much more you can do with extractors! Any time you need to use text on the document for some Grooper activity, you're going to need an extractor. Data extractors can be used to:
• Collect field data from documents
• Classify documents
• Separate documents
• Redact text on a document
• And much more (there are over 100 different Grooper objects on which you could conceivably configure an extractor!)
Extractors are so important in Grooper because they simulate how you, a human, read and understand a document. How do you know you're looking at an invoice as opposed to an HR benefits enrollment form? By reading it, looking for patterns and words that are common to one and not the other. How do you find the invoice number on that invoice? By reading it, looking for labels next to something that reads like an invoice number. Extractors work much the same way to much the same ends. They are tools that automate the logic of reading and understanding documents, logic that is so intuitive to a human reader.
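To make that idea concrete, here is a minimal sketch in plain Python of the two habits described above: spotting words common to one document type, and finding a value by the label next to it. This is not Grooper configuration or a Grooper API. The sample text, the pattern, the keyword list, and the function names are all assumptions made up for illustration.

```python
import re

# Hypothetical sample text standing in for the text read from an invoice.
SAMPLE_TEXT = """ACME Supply Co.
Invoice No: INV-20417
Date: 03/14/2024
Total Due: $1,250.00
"""

# A label ("Invoice No", "Invoice Number", "Invoice #") followed by something
# that reads like an invoice number. The pattern is an illustrative assumption.
INVOICE_NUMBER = re.compile(
    r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z]{0,4}-?\d{3,10})",
    re.IGNORECASE,
)


def find_invoice_number(text):
    """Return the value next to an invoice-number label, or None."""
    match = INVOICE_NUMBER.search(text)
    return match.group(1) if match else None


def looks_like_invoice(text):
    """Guess the document type from words common to invoices."""
    keywords = ("invoice", "total due", "bill to")  # assumed keyword list
    lowered = text.lower()
    return sum(keyword in lowered for keyword in keywords) >= 2


print(find_invoice_number(SAMPLE_TEXT))
print(looks_like_invoice(SAMPLE_TEXT))
```

The same text-matching logic serves both purposes here, which is the point of the paragraph above: one kind of tool can collect a field value and also help decide what kind of document you're looking at.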
That said, there's not just one way a human reads and understands a document. People do it in multiple ways: recognizing patterns in text data, using context clues provided by certain words, and even just analyzing where things are physically on the page. There's no single one-size-fits-all extractor either. Instead, there are multiple tools in Grooper's data extraction toolkit, each with its own set of configurations and internal logic, designed to best target certain ways data is organized and presented on a document.
This course aims to educate you on the plethora of data extraction tools in Grooper. We will detail the extractor objects and extractor types available and how to configure them. We will also touch briefly on how these are used in the real world of document processing. This course is a critical prerequisite for all other Consultant courses, as each of these tools will pop up throughout later coursework.
BONUS! Because data extraction is so foundational to most of Grooper's document processing, completing this course, Data Extraction 101, lets you enroll in a second course, Data Extraction Techniques, at no additional cost.