Testing & Validation
Proper testing and validation of your ontology ensures that it works correctly with real data and produces high-quality knowledge graphs. This guide covers validation techniques, testing strategies, and debugging approaches.
Validation Levels
Graphora provides multiple levels of validation to ensure ontology correctness:
Syntax Validation
YAML syntax correctness and basic structure validation
Schema Validation
Entity references, property types, and relationship integrity
Quality Validation
Quality rule testing with sample data and scoring validation
Integration Testing
End-to-end testing with real document transformation
Syntax Validation
The first level of validation ensures your YAML is correctly formatted and follows the expected schema.
Common Syntax Errors
Invalid YAML Syntax
Problem: Malformed YAML structure
Solution: Proper YAML formatting
Missing Required Fields
Problem: Missing mandatory sections
Solution: Include all required sections
Invalid Property Types
Problem: Unsupported property type
Solution: Use supported types
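The last two error classes above can be checked mechanically once the YAML has been parsed. The following is a minimal sketch, assuming the ontology has already been loaded into a Python dict. The section names (`entities`, `relationships`) and the set of supported types are illustrative assumptions, not Graphora's documented schema.

```python
# Hypothetical required sections and property types; substitute the
# values from your ontology specification.
REQUIRED_SECTIONS = ("entities", "relationships")
SUPPORTED_TYPES = {"string", "integer", "float", "boolean", "date"}


def check_structure(ontology: dict) -> list:
    """Return a list of human-readable structural errors (empty if none)."""
    errors = []
    for section in REQUIRED_SECTIONS:
        if section not in ontology:
            errors.append(f"missing required section: {section}")
    for entity, spec in ontology.get("entities", {}).items():
        for prop, prop_spec in spec.get("properties", {}).items():
            prop_type = prop_spec.get("type")
            if prop_type not in SUPPORTED_TYPES:
                errors.append(f"{entity}.{prop}: unsupported type {prop_type!r}")
    return errors


# This ontology lacks a "relationships" section and uses an unsupported type.
ontology = {
    "entities": {
        "Person": {"properties": {"name": {"type": "string"},
                                  "age": {"type": "number"}}}
    }
}
errors = check_structure(ontology)
for error in errors:
    print(error)
```

Collecting all errors in a list, rather than raising on the first one, lets a single run report every structural problem at once.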
Validation API
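A request to a validation endpoint can be prepared with Python's standard library. This is a minimal sketch only: the URL, the bearer-token header, and the YAML content type are placeholders assumed for illustration, so take the real endpoint and authentication scheme from your Graphora API reference.

```python
import urllib.request


def build_validation_request(url: str, ontology_yaml: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the ontology YAML."""
    return urllib.request.Request(
        url,
        data=ontology_yaml.encode("utf-8"),
        headers={
            "Content-Type": "application/yaml",  # assumed content type
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


# Placeholder URL: this is not a real Graphora endpoint.
request = build_validation_request(
    "https://api.example.com/ontology/validate",
    "entities: {}\n",
    token="YOUR_TOKEN",
)
# To send it, call urllib.request.urlopen(request) and inspect the response body.
```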
Use the validation endpoint to check syntax before registration.
Schema Validation
Schema validation ensures that entity references, relationships, and constraints are logically consistent.
Reference Integrity
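A reference-integrity check can be sketched in a few lines of Python. The sketch assumes the ontology is already parsed into a dict; the `entities`/`relationships` layout with `source` and `target` keys is hypothetical, not Graphora's documented schema.

```python
def find_dangling_references(ontology: dict) -> list:
    """Return (relationship, role, entity) triples that point at undefined entities."""
    defined = set(ontology.get("entities", {}))
    dangling = []
    for name, rel in ontology.get("relationships", {}).items():
        for role in ("source", "target"):
            entity = rel.get(role)
            if entity not in defined:
                dangling.append((name, role, entity))
    return dangling


ontology = {
    "entities": {"Person": {}, "Company": {}},
    "relationships": {
        "WORKS_AT": {"source": "Person", "target": "Company"},
        "LOCATED_IN": {"source": "Company", "target": "City"},  # City is undefined
    },
}
dangling = find_dangling_references(ontology)
print(dangling)
```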
All relationship targets must reference defined entities.
Constraint Validation
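One way to catch contradictory constraints is to look for pairs that can never be satisfied together. The sketch below is illustrative: the constraint keys (`min`, `max`, `min_length`, `max_length`, `enum`, `default`) are assumed names and may differ from those your ontology format uses.

```python
def check_constraints(prop_name: str, spec: dict) -> list:
    """Return descriptions of constraints that can never be satisfied together."""
    problems = []
    # A lower bound above its upper bound admits no value at all.
    for low, high in (("min", "max"), ("min_length", "max_length")):
        if low in spec and high in spec and spec[low] > spec[high]:
            problems.append(f"{prop_name}: {low} ({spec[low]}) exceeds {high} ({spec[high]})")
    # A default outside the allowed values would violate the enum immediately.
    if "enum" in spec and "default" in spec and spec["default"] not in spec["enum"]:
        problems.append(f"{prop_name}: default {spec['default']!r} is not in enum")
    return problems


problems = (
    check_constraints("age", {"type": "integer", "min": 150, "max": 0})
    + check_constraints("status", {"enum": ["active", "inactive"], "default": "unknown"})
    + check_constraints("name", {"min_length": 1, "max_length": 200})
)
for problem in problems:
    print(problem)
```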
Validate that property constraints are logically consistent.
Relationship Cardinality
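Cardinality can be tested against sample edges before any real transformation is run. The following sketch assumes a four-value cardinality vocabulary (`one-to-one`, `one-to-many`, `many-to-one`, `many-to-many`) and represents edges as `(source_id, target_id)` pairs; both are assumptions made for illustration.

```python
from collections import Counter

# Assumed cardinality vocabulary; None means "no upper limit".
MAX_PER_SOURCE = {"one-to-one": 1, "many-to-one": 1, "one-to-many": None, "many-to-many": None}
MAX_PER_TARGET = {"one-to-one": 1, "one-to-many": 1, "many-to-one": None, "many-to-many": None}


def cardinality_violations(cardinality: str, edges: list) -> list:
    """Return the node ids that take part in more edges than the cardinality allows."""
    violations = []
    sources = Counter(src for src, _ in edges)
    targets = Counter(dst for _, dst in edges)
    for counts, limit in ((sources, MAX_PER_SOURCE[cardinality]),
                          (targets, MAX_PER_TARGET[cardinality])):
        if limit is not None:
            violations += [node for node, n in counts.items() if n > limit]
    return violations


# Under many-to-one, each person may work at only one company, but "alice" has two.
edges = [("alice", "acme"), ("alice", "globex"), ("bob", "acme")]
violations = cardinality_violations("many-to-one", edges)
print(violations)
```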
Ensure cardinality specifications make sense.
Quality Rule Testing
Test your quality rules with sample data to ensure they work as expected.
Creating Test Data
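A small, labeled fixture makes it clear which scenario each record is meant to exercise. The records below are hypothetical examples for a `Person` entity; the property names and the `expect_valid` flag are assumptions chosen for illustration.

```python
# Hypothetical records for a "Person" entity; the property names are illustrative.
TEST_CASES = [
    {"label": "valid record",
     "record": {"name": "Ada Lovelace", "email": "ada@example.com", "age": 36},
     "expect_valid": True},
    {"label": "missing required property",
     "record": {"email": "ada@example.com", "age": 36},
     "expect_valid": False},
    {"label": "wrong property type",
     "record": {"name": "Ada Lovelace", "email": "ada@example.com", "age": "thirty-six"},
     "expect_valid": False},
    {"label": "boundary value",
     "record": {"name": "Ada Lovelace", "email": "ada@example.com", "age": 0},
     "expect_valid": True},
    {"label": "empty string",
     "record": {"name": "", "email": "ada@example.com", "age": 36},
     "expect_valid": False},
]

# Sanity-check the fixture itself: labels are unique and both outcomes are covered.
labels = [case["label"] for case in TEST_CASES]
assert len(labels) == len(set(labels))
assert {case["expect_valid"] for case in TEST_CASES} == {True, False}
```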
Prepare test data that covers different scenarios.
Quality Testing Script
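A quality test script can be as simple as applying each rule to each sample record and reporting a score. In the sketch below the rules are plain Python predicates and the score is the fraction of passed checks; both are stand-ins assumed for illustration, not Graphora's rule syntax or scoring formula.

```python
import re

# Hypothetical quality rules: rule name -> predicate over a record.
RULES = {
    "name_present": lambda r: bool(r.get("name")),
    "email_format": lambda r: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", r.get("email", "")) is not None,
    "age_in_range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 150,
}


def evaluate(records: list) -> dict:
    """Apply every rule to every record; return the score and the violations."""
    violations = [(i, name) for i, rec in enumerate(records)
                  for name, rule in RULES.items() if not rule(rec)]
    total = len(records) * len(RULES)
    score = 1.0 if total == 0 else 1 - len(violations) / total
    return {"score": score, "violations": violations}


records = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "age": 36},
    {"name": "", "email": "not-an-email", "age": 36},
]
report = evaluate(records)
print(f"score: {report['score']:.2f}")
for index, rule in report["violations"]:
    print(f"record {index} violates {rule}")
```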
Create a comprehensive test script.
Integration Testing
Test your ontology with real document transformation to ensure end-to-end functionality.
Document-Based Testing
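A document-based test feeds a known document through the transformation and checks the result against expectations. The harness below is a sketch that takes the transformation as a parameter, so it does not assume any particular client API. The result shape (an `entities` list with a `type` key, plus a `quality_score`) is hypothetical, and `fake_transform` is a stand-in for the real call.

```python
def run_document_test(transform, document_text: str, expected_types: set, min_score: float) -> list:
    """Run one document through `transform` and return a list of failed expectations."""
    result = transform(document_text)
    failures = []
    found = {entity["type"] for entity in result["entities"]}
    missing = expected_types - found
    if missing:
        failures.append(f"missing entity types: {sorted(missing)}")
    if result["quality_score"] < min_score:
        failures.append(f"quality score {result['quality_score']:.2f} below {min_score:.2f}")
    return failures


def fake_transform(text: str) -> dict:
    """Stand-in for the real transformation call; returns a canned result."""
    return {"entities": [{"type": "Person", "name": "Ada Lovelace"}], "quality_score": 0.9}


failures = run_document_test(
    fake_transform,
    "Ada Lovelace worked with Charles Babbage.",
    expected_types={"Person", "Company"},
    min_score=0.8,
)
print(failures)
```

Passing the transformation in as a callable keeps the harness usable both with a canned result in unit tests and with the real service in integration runs.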
Validation Best Practices
1. Comprehensive Test Coverage
Test different types of data and edge cases.
2. Iterative Refinement
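Comparing violation counts between two test runs shows whether a change to the ontology helped or hurt. A minimal sketch, assuming each run is recorded as a list of violated rule names (the rule names and data here are illustrative):

```python
from collections import Counter


def compare_runs(before: list, after: list) -> dict:
    """Return the change in violation count per rule (negative means improvement)."""
    old, new = Counter(before), Counter(after)
    return {rule: new[rule] - old[rule] for rule in sorted(set(old) | set(new))}


# Rule names recorded for each violation in two test runs (illustrative data).
run_v1 = ["email_format", "email_format", "age_in_range", "name_present"]
run_v2 = ["email_format", "name_present", "name_present"]
delta = compare_runs(run_v1, run_v2)
for rule, change in delta.items():
    print(f"{rule}: {change:+d}")
```

A rule whose count rises after a change, such as `name_present` here, is a sign that the revision fixed one problem by introducing another.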
Use test results to refine your ontology.
3. Performance Testing
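Timing the same operation on batches of increasing size gives a first indication of how processing scales. The sketch below uses `time.perf_counter` from the standard library; `validate_batch` is a placeholder workload standing in for whatever validation or transformation step you want to measure.

```python
import time


def time_batches(process, make_batch, sizes: list) -> list:
    """Time `process` on batches of increasing size; return (size, seconds) pairs."""
    timings = []
    for size in sizes:
        batch = make_batch(size)  # built outside the timed region
        start = time.perf_counter()
        process(batch)
        timings.append((size, time.perf_counter() - start))
    return timings


def validate_batch(batch: list) -> int:
    """Placeholder workload: count records that have a non-empty name."""
    return sum(1 for record in batch if record.get("name"))


timings = time_batches(
    validate_batch,
    lambda n: [{"name": f"entity-{i}"} for i in range(n)],
    sizes=[1_000, 10_000, 100_000],
)
for size, seconds in timings:
    print(f"{size:>7} records: {seconds * 1000:.2f} ms")
```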
Monitor ontology performance with large datasets.
Debugging Common Issues
Quality Score Too Low
Symptoms: Consistently low quality scores across different documents
Debugging Steps:
- Check violation patterns: Identify most common rule violations
- Review test data: Ensure test data matches expected real-world data
- Adjust thresholds: Consider if quality rules are too strict
- Examine extraction confidence: Low confidence may indicate ontology-data mismatch
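The first step, finding the most common violations, can be sketched with `collections.Counter`. The violation records below are illustrative; the `rule` and `entity` keys are assumed, and the real violation report may be shaped differently.

```python
from collections import Counter


def top_violations(violations: list, limit: int = 3) -> list:
    """Return the most frequent rules as (rule, count, share_of_all_violations)."""
    counts = Counter(v["rule"] for v in violations)
    total = sum(counts.values())
    return [(rule, n, n / total) for rule, n in counts.most_common(limit)]


violations = [
    {"rule": "email_format", "entity": "Person"},
    {"rule": "email_format", "entity": "Person"},
    {"rule": "email_format", "entity": "Contact"},
    {"rule": "age_in_range", "entity": "Person"},
]
summary = top_violations(violations)
for rule, count, share in summary:
    print(f"{rule}: {count} violations ({share:.0%})")
```

If one rule accounts for most violations, as `email_format` does here, that rule or its threshold is the first place to look.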
No Entities Extracted
Symptoms: Transformation completes but extracts no entities
Possible Causes:
- Ontology too specific for document content
- Missing required properties preventing entity creation
- Document format not supported
- Extraction confidence threshold too high
Relationship Cardinality Violations
Symptoms: Relationships not created due to cardinality constraints
Debugging:
Automated Testing Pipeline
Set up automated testing for continuous validation:
python -m unittest test_ontology.py
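A `test_ontology.py` that this command could run might look like the sketch below. The in-line `ONTOLOGY` dict and its layout are hypothetical stand-ins; a real suite would load the YAML file under test instead.

```python
import unittest

# Stand-in ontology with a hypothetical layout; a real suite would load
# the YAML file under test instead.
ONTOLOGY = {
    "entities": {
        "Person": {"properties": {"name": {"type": "string"}}},
        "Company": {"properties": {"name": {"type": "string"}}},
    },
    "relationships": {"WORKS_AT": {"source": "Person", "target": "Company"}},
}


class OntologyTests(unittest.TestCase):
    def test_relationship_endpoints_are_defined(self):
        defined = set(ONTOLOGY["entities"])
        for name, rel in ONTOLOGY["relationships"].items():
            # subTest reports each failing relationship separately.
            with self.subTest(relationship=name):
                self.assertIn(rel["source"], defined)
                self.assertIn(rel["target"], defined)

    def test_every_property_declares_a_type(self):
        for entity, spec in ONTOLOGY["entities"].items():
            for prop, prop_spec in spec["properties"].items():
                with self.subTest(entity=entity, prop=prop):
                    self.assertIn("type", prop_spec)
```

Because the `unittest` runner discovers `TestCase` subclasses on its own, the file needs no `__main__` guard to be run this way.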
Validation Checklist
Before deploying your ontology to production:
Syntax Validation
✅ YAML syntax is correct
✅ All required fields present
✅ Property types are supported
✅ No structural errors
Schema Validation
✅ All relationship targets reference defined entities
✅ Property constraints are logically consistent
✅ Cardinality specifications make sense
✅ No circular references in required properties
Quality Rule Testing
✅ Quality rules tested with valid data
✅ Quality rules tested with invalid data
✅ Thresholds appropriate for use case
✅ Common violations identified and addressed
Integration Testing
✅ End-to-end testing with real documents
✅ Performance acceptable for expected load
✅ Error handling works correctly
✅ Results meet business requirements
Next Steps
Deploy to Production
Learn how to use your validated ontology for document transformation
Monitor Quality
Set up ongoing quality monitoring and improvement processes
