Multi-Modal Large Models for document understanding

Thursday, February 01, 2024 - 1 min

Large Language Model for document understanding

A novel approach for document named entity recognition

This project introduces an approach for automatically parsing unstructured text from searchable PDF invoices into structured schemas (named entity recognition). Traditional methods rely heavily on manual effort and rigid templates, which makes them inefficient and hard to scale. We use pre-trained Transformer networks to build a machine learning model that understands and structures invoice data without predefined rules, combining them with techniques for both sequential and simultaneous field extraction. The aim is to significantly reduce manual data entry and improve the reliability of the information stored in databases. The project will also explore cloud infrastructure for scalability and a Human-in-the-Loop system for validation and quality assurance (QA).
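To make the structuring step concrete, here is a minimal sketch of how token-level entity tags could be collapsed into schema fields, with low-confidence fields flagged for Human-in-the-Loop review. It is an illustration under stated assumptions, not the project's implementation: the BIO tagging scheme, the label names (INVOICE_NUMBER, TOTAL), the `Field` class, the `decode_fields` helper, and the 0.8 review threshold are all hypothetical, and the tags and scores are hard-coded where a pre-trained Transformer token classifier would normally supply them.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str           # schema field name (hypothetical label set)
    value: str          # text recovered from the invoice
    confidence: float   # lowest token score inside the span
    needs_review: bool  # True if the field should go to a human reviewer


def decode_fields(tokens, tags, scores, review_threshold=0.8):
    """Collapse token-level BIO tags into structured fields.

    tokens, tags and scores are parallel lists. In a real system the tags
    and scores would come from a token-classification model; here they are
    plain inputs so the sketch stays self-contained.
    """
    fields = []
    current = None  # (field name, collected tokens, collected scores)

    def flush():
        nonlocal current
        if current is not None:
            name, toks, scs = current
            conf = min(scs)  # a span is only as reliable as its weakest token
            fields.append(Field(name, " ".join(toks), conf, conf < review_threshold))
            current = None

    for token, tag, score in zip(tokens, tags, scores):
        if tag.startswith("B-"):
            # A new entity begins: close any open span first.
            flush()
            current = (tag[2:], [token], [score])
        elif tag.startswith("I-") and current is not None and current[0] == tag[2:]:
            # Continuation of the currently open span.
            current[1].append(token)
            current[2].append(score)
        else:
            # "O" tag, or an "I-" tag with no matching open span: treat as outside.
            flush()
    flush()
    return fields


# Hard-coded example input standing in for model output.
tokens = ["Invoice", "INV-1042", "Total", "1,250.00", "EUR"]
tags = ["O", "B-INVOICE_NUMBER", "O", "B-TOTAL", "I-TOTAL"]
scores = [0.99, 0.97, 0.98, 0.91, 0.62]

for field in decode_fields(tokens, tags, scores):
    print(field)
```

In this sketch the total would be flagged for review because one of its tokens scores below the threshold, while the invoice number would pass through unreviewed. Taking the minimum token score per span is one conservative design choice; averaging the scores is a common alternative.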

Deliverables include a trained ML model, comprehensive documentation, and integration strategies for real-world applications, with the goal of setting new standards for document processing automation.

Rafa Felix

PhD who climbs and enjoys long-distance rides.
