Document Parser Agent
Agent ID: document-parser-agent
Task Description:
Utilizes OCR and AI models to parse PDFs, images, and other document formats to extract text and identify key-value pairs or tables.
Reusability:
Reusable
Agent Type:
Simple AI (Prompt-driven)
Implementation Notes:
Genkit flow with Gemini (or similar model with multimodal capabilities) for OCR and information extraction. Can handle various layouts.
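As a rough illustration of the prompt-driven approach noted above, the TypeScript sketch below builds an extraction prompt and parses a model's JSON reply into the two fields named in the skill's output schema (`extractedText`, `structuredData`). It deliberately leaves out the Genkit and Gemini calls. `ParsedDocument`, `buildExtractionPrompt`, `parseModelReply`, and the prompt wording are hypothetical, chosen for illustration only, and are not the agent's actual implementation.

```typescript
// Hypothetical shape mirroring the skill's output schema example.
interface ParsedDocument {
  extractedText: string;
  structuredData: Record<string, unknown>;
}

// Builds the instruction sent to a multimodal model alongside the document.
// The wording is an assumption, not the agent's real prompt.
function buildExtractionPrompt(documentTypeHint?: string): string {
  const hint = documentTypeHint ? ` The document is likely a ${documentTypeHint}.` : "";
  return (
    "Extract all text from the attached document and identify key-value pairs or tables." +
    hint +
    ' Reply with JSON only: {"extractedText": string, "structuredData": object}.'
  );
}

// Parses the reply defensively, since model output is not guaranteed to be valid JSON.
function parseModelReply(raw: string): ParsedDocument {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    throw new Error("Model reply is not valid JSON");
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Model reply must be a JSON object");
  }
  const reply = data as Record<string, unknown>;
  const text = reply.extractedText;
  const structured = reply.structuredData;
  if (typeof text !== "string") {
    throw new Error("extractedText must be a string");
  }
  if (typeof structured !== "object" || structured === null || Array.isArray(structured)) {
    throw new Error("structuredData must be an object");
  }
  return { extractedText: text, structuredData: structured as Record<string, unknown> };
}
```

Rejecting malformed replies up front keeps downstream consumers from receiving partially parsed output; a real flow might retry the model call instead of throwing.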
Last Evaluated
2024-07-28
Accuracy
95.0%
Latency
avg 1.8s/doc
Cost Per Interaction
$0.009/doc
General Evaluation Notes
Accuracy depends on document quality and layout complexity.
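For planning purposes, the latency and cost figures above can be scaled to a batch. The TypeScript sketch below is a back-of-the-envelope estimate only: it assumes documents are processed one at a time and that the reported averages hold across the batch, neither of which the evaluation states. `estimateBatch` and `BatchEstimate` are hypothetical names.

```typescript
// Figures taken from the evaluation above; both are per-document averages.
const AVG_LATENCY_SECONDS = 1.8;
const COST_PER_DOC_USD = 0.009;

interface BatchEstimate {
  totalCostUsd: number;
  sequentialSeconds: number;
}

// Assumes sequential processing and that the averages apply to every document.
function estimateBatch(docCount: number): BatchEstimate {
  if (!Number.isInteger(docCount) || docCount < 0) {
    throw new RangeError("docCount must be a non-negative integer");
  }
  return {
    // Round to avoid floating-point noise in the reported totals.
    totalCostUsd: Number((docCount * COST_PER_DOC_USD).toFixed(3)),
    sequentialSeconds: Number((docCount * AVG_LATENCY_SECONDS).toFixed(1)),
  };
}
```

Under these assumptions, 1,000 documents come to roughly $9 and 1,800 seconds (30 minutes) of sequential processing time.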
Identity (Core A2A):
- Name: Document Parser Agent
- Unique ID: document-parser-agent
Primary Function (A2A):
Utilizes OCR and AI models to parse PDFs, images, and other document formats to extract text and identify key-value pairs or tables.
Defined Agent Skills (A2A Interface):
Skill Orchestration & Execution
The backend system's orchestration layer handles this agent's operational logic, including invoking and managing its defined skills. That layer controls the agent's execution sequence, handles data according to its skill schemas, and carries out any necessary interactions with tools or external services. The specific execution method (e.g., an AI model call or deterministic code execution) is given in the 'Implementation Notes' within the Agent Overview.
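One way to picture the orchestration layer's role is a registry that maps an agent and skill ID to a handler and invokes it on request. The TypeScript sketch below is illustrative only: `SkillHandler`, `SkillRegistry`, and the stub handler are hypothetical and do not describe the backend's actual design. The IDs are the ones used in this agent's definition.

```typescript
// A skill handler takes a JSON-like input and returns a JSON-like output.
type SkillHandler = (input: Record<string, unknown>) => Record<string, unknown>;

// Hypothetical registry keyed by "agentId/skillId".
class SkillRegistry {
  private handlers = new Map<string, SkillHandler>();

  register(agentId: string, skillId: string, handler: SkillHandler): void {
    this.handlers.set(`${agentId}/${skillId}`, handler);
  }

  // Looks up the handler and runs it; unknown skills fail loudly.
  invoke(
    agentId: string,
    skillId: string,
    input: Record<string, unknown>
  ): Record<string, unknown> {
    const handler = this.handlers.get(`${agentId}/${skillId}`);
    if (!handler) {
      throw new Error(`No handler registered for ${agentId}/${skillId}`);
    }
    return handler(input);
  }
}

// Stub handler standing in for the real model-backed extraction.
const registry = new SkillRegistry();
registry.register("document-parser-agent", "parse-document-content", (input) => ({
  extractedText: `stub text for ${String(input.documentReference)}`,
  structuredData: {},
}));
```

Keeping dispatch separate from the handler means the same call path works whether a skill is backed by a model call or by deterministic code.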
{
  "id": "document-parser-agent",
  "name": "Document Parser Agent",
  "description": "Extracts text and structured information from documents.",
  "isReusable": true,
  "taskDescription": "Utilizes OCR and AI models to parse PDFs, images, and other document formats to extract text and identify key-value pairs or tables.",
  "icon": {
    "displayName": "FileText"
  },
  "agentType": "ai-simple",
  "implementationNotes": "Genkit flow with Gemini (or similar model with multimodal capabilities) for OCR and information extraction. Can handle various layouts.",
  "responsibleTeamIds": [
    "team-ai-core"
  ],
  "skills": [
    {
      "id": "parse-document-content",
      "name": "Parse Document Content",
      "description": "Extracts text and structured data from a document.",
      "inputSchemaExample": "{\n \"properties\": {\n \"documentReference\": {\n \"type\": \"string\"\n },\n \"documentTypeHint\": {\n \"type\": \"string\"\n }\n }\n}",
      "outputSchemaExample": "{\n \"properties\": {\n \"extractedText\": {\n \"type\": \"string\"\n },\n \"structuredData\": {\n \"type\": \"object\"\n }\n }\n}"
    }
  ],
  "evaluation": {
    "lastEvaluated": "2024-07-28",
    "accuracy": 0.95,
    "latency": "avg 1.8s/doc",
    "costPerInteraction": "$0.009/doc",
    "notes": "Accuracy depends on document quality and layout complexity."
  },
  "inputs": [
    "gds-raw-document-data"
  ],
  "outputs": [
    "gds-parsed-document-content"
  ]
}
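The `inputSchemaExample` string in the definition above can be used for a lightweight check of skill inputs before invocation. The TypeScript sketch below is a minimal example under stated assumptions: it handles only flat properties, treats every property as optional, and flags undeclared properties, none of which the schema example itself specifies. `SchemaExample` and `checkAgainstSchemaExample` are hypothetical names; a production system would more likely use a full JSON Schema validator.

```typescript
// Input schema example copied from the skill definition, with whitespace removed.
const inputSchemaExample =
  '{"properties":{"documentReference":{"type":"string"},"documentTypeHint":{"type":"string"}}}';

interface SchemaExample {
  properties: Record<string, { type: string }>;
}

// Returns a list of problems; an empty list means the value matches the example.
// Limitation: uses `typeof`, so "object" also matches arrays and null.
function checkAgainstSchemaExample(
  schemaJson: string,
  value: Record<string, unknown>
): string[] {
  const schema = JSON.parse(schemaJson) as SchemaExample;
  const problems: string[] = [];
  for (const key of Object.keys(value)) {
    const declared = schema.properties[key];
    if (!declared) {
      // Assumption: undeclared properties are reported rather than ignored.
      problems.push(`unknown property: ${key}`);
    } else if (typeof value[key] !== declared.type) {
      problems.push(`${key} should be ${declared.type}`);
    }
  }
  return problems;
}
```

For example, passing `{ documentReference: 42 }` yields a single problem, `documentReference should be string`, while a value with two string fields yields an empty list.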