When implementing FHIR systems in healthcare organizations, tracking data sources becomes crucial for compliance, auditing, and user transparency. A recent technical discussion we had with a major healthcare organization highlights a common dilemma: how to properly categorize medical resources by their data origin.
For our Hebrew-speaking readers, this blog post is also available in Hebrew.
The Challenge
Healthcare organizations typically handle three types of data sources:
- Internal data created internally (e.g., clinical records from EMR systems)
- External data created internally (e.g., reports from non-integrated providers processed by internal staff)
- External data created externally (e.g., data imported through interoperability exchanges)
The question becomes: what's the best FHIR-compliant approach for tracking these origins?
What Are the Main Approaches for FHIR Data Source Tracking?
When Should You Use meta.source for Data Tracking?
The meta.source element sits directly on each FHIR resource as a URI identifying the data source.
Advantages:
- Simplicity: Data source information travels with the resource itself
- Immediate access: Consuming applications get source information without additional queries
- Lightweight: No separate resources to maintain
Disadvantages:
- URI complexity: Requires defining and maintaining URIs for each data source type (e.g., URIs for "internal EMR data," "external provider created internally," "imported external data")
- URI interpretation challenges: URIs can be cumbersome to interpret, because it is not immediately clear to a reader or a consuming application what each URI represents
- Single source limitation: 0..1 cardinality means only one source can be recorded, but data often has multiple sources over its lifecycle
- Evolution challenges: As data passes through multiple systems and gets updated from various sources, the single source model becomes unrealistic
- Semantic ambiguity: "Source" can mean different things - original data source vs. who created this specific resource instance
Best for: Organizations with simple, stable data flows where resources have clear, singular origins.
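As a minimal sketch of the meta.source approach, the three origin categories from the challenge above can be mapped to one URI each. The URIs, the category names, and the helper functions `tag_source` and `source_category` are assumptions made for illustration; they are not defined by FHIR or by any particular server.

```python
from copy import deepcopy
from typing import Optional

# Hypothetical URIs, one per origin category. Real values would be agreed
# and governed by the organization.
SOURCE_URIS = {
    "internal": "https://example.org/fhir/source/internal-emr",
    "external-entered-internally": "https://example.org/fhir/source/external-entered-internally",
    "external-imported": "https://example.org/fhir/source/external-imported",
}


def tag_source(resource: dict, category: str) -> dict:
    """Return a copy of the resource with meta.source set for the category."""
    tagged = deepcopy(resource)
    # meta.source has cardinality 0..1, so this overwrites any earlier value.
    tagged.setdefault("meta", {})["source"] = SOURCE_URIS[category]
    return tagged


def source_category(resource: dict) -> Optional[str]:
    """Map meta.source back to a category name, or None if absent or unknown."""
    uri = resource.get("meta", {}).get("source")
    for category, known_uri in SOURCE_URIS.items():
        if uri == known_uri:
            return category
    return None


# A trimmed Observation used only as example data.
observation = {
    "resourceType": "Observation",
    "id": "obs-1",
    "status": "final",
    "code": {"text": "Example observation"},
}
tagged = tag_source(observation, "external-imported")
```

A consuming application could call `source_category(tagged)` to decide which user interface hint to show. The lookup table is also where the URI interpretation burden listed above becomes visible: every consumer needs the same mapping.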
When Is Provenance the Right Choice for Your Organization?
The Provenance resource creates separate records linked to resources via references, documenting the complete lifecycle of data creation and modification.
Advantages:
- Detailed tracking: Full audit trail of who did what, when, and why
- Multiple actors: Can document complex workflows involving multiple systems and users
- Privacy control: Sensitive provenance information can be restricted without affecting the primary resource - you may not want to expose internal data sourcing details to external consumers
- Temporal precision: Timestamps and detailed activity records
- Lifecycle documentation: Can track all actions performed on a resource throughout its entire lifecycle
- Flexible sourcing: Can document multiple data sources and their relationships comprehensively
- Separate resource model: Doesn't burden external data consumers with internal provenance details they may not need
Disadvantages:
- Complexity overhead: Requires maintaining additional resources and relationships
- Performance impact: Retrieving provenance information requires additional queries, and every write or update of a tracked resource also requires writing a Provenance resource
- Overkill potential: May be unnecessarily complex for simple use cases
- Maintenance burden: Provenance resources must be kept up to date as data changes, which is only practical with automation
- Infrastructure requirements: Need systems capable of automatically maintaining provenance records
Best for: Organizations requiring detailed audit trails, complex multi-system workflows, or strict compliance requirements.
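To make the shape of a Provenance record concrete, here is a minimal sketch that builds one as a plain dictionary. The element names (`target`, `recorded`, `agent.who`, `entity.role`, `entity.what`) and the `author` participant type follow FHIR R4. The function name `build_provenance` and all reference values are hypothetical, and a real profile may require further elements.

```python
from datetime import datetime, timezone
from typing import Optional

# Code system for Provenance.agent.type in FHIR R4.
PARTICIPANT_TYPE_SYSTEM = "http://terminology.hl7.org/CodeSystem/provenance-participant-type"


def build_provenance(
    target_ref: str,
    agent_ref: str,
    recorded: datetime,
    source_ref: Optional[str] = None,
) -> dict:
    """Build a minimal Provenance resource for one target resource."""
    provenance = {
        "resourceType": "Provenance",
        # The resource(s) this record describes.
        "target": [{"reference": target_ref}],
        # When the activity was recorded, as a FHIR instant.
        "recorded": recorded.isoformat(timespec="seconds"),
        # Who or what was responsible for the activity.
        "agent": [
            {
                "type": {
                    "coding": [
                        {"system": PARTICIPANT_TYPE_SYSTEM, "code": "author"}
                    ]
                },
                "who": {"reference": agent_ref},
            }
        ],
    }
    if source_ref is not None:
        # An entity with role "source" points at what the data was taken from.
        provenance["entity"] = [
            {"role": "source", "what": {"reference": source_ref}}
        ]
    return provenance


# Hypothetical references, for illustration only.
example = build_provenance(
    target_ref="Observation/obs-1",
    agent_ref="Organization/external-lab",
    recorded=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    source_ref="DocumentReference/scanned-report-7",
)
```

Because the record lives in its own resource, a client that wants it has to ask for it, for example with a search such as `Observation?_id=obs-1&_revinclude=Provenance:target` on servers that support `_revinclude`. That extra step is the query overhead listed above, and it is also what allows access to provenance to be restricted separately.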
Are Extension-Based Solutions Right for Your Use Case?
Custom extensions can be defined to add structured data source information directly to resources without the constraints of existing elements.
Advantages:
- Structured data: Unlike URI-based approaches, can include rich metadata (source type, confidence level, processing date)
- Multiple sources: Can handle complex scenarios where data has multiple origins
- Flexibility: Tailored exactly to organizational needs
- Performance: Information travels with the resource like meta.source, meaning no extra queries
Disadvantages:
- Interoperability: Custom extensions may not be understood by external systems
- Maintenance: Requires ongoing governance and documentation
- Standardization risk: May diverge from future FHIR standards
Best for: Organizations with specific data source tracking requirements that don't fit standard elements, or those needing structured categorization beyond simple URIs.
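One way an extension-based design could look is a complex extension whose nested parts carry the individual attributes. In the sketch below, the extension URL, the part names (`sourceType`, `processedOn`, `originSystem`), and the helper functions are all assumptions for illustration. Only the general structure (an outer `url` with nested `extension` entries that each hold a `value[x]`) comes from FHIR.

```python
from copy import deepcopy
from typing import List, Optional

# Hypothetical canonical URL. A real one would point at a published
# StructureDefinition owned by the organization.
DATA_ORIGIN_URL = "https://example.org/fhir/StructureDefinition/data-origin"


def data_origin_extension(
    source_type: str, processed_on: str, origin_system: Optional[str] = None
) -> dict:
    """Build one complex extension describing a single data origin."""
    parts = [
        {"url": "sourceType", "valueCode": source_type},
        {"url": "processedOn", "valueDateTime": processed_on},
    ]
    if origin_system is not None:
        parts.append({"url": "originSystem", "valueUri": origin_system})
    return {"url": DATA_ORIGIN_URL, "extension": parts}


def add_data_origin(resource: dict, origin: dict) -> dict:
    """Return a copy of the resource with one more data-origin extension."""
    updated = deepcopy(resource)
    updated.setdefault("extension", []).append(origin)
    return updated


def read_data_origins(resource: dict) -> List[dict]:
    """Collect every data-origin extension as a flat dict of its parts."""
    origins = []
    for ext in resource.get("extension", []):
        if ext.get("url") != DATA_ORIGIN_URL:
            continue
        origin = {}
        for part in ext.get("extension", []):
            for key, value in part.items():
                if key.startswith("value"):
                    origin[part["url"]] = value
        origins.append(origin)
    return origins


# Example data: a report that was received externally, then entered by staff.
report = {"resourceType": "DiagnosticReport", "id": "rep-1", "status": "final"}
report = add_data_origin(
    report,
    data_origin_extension(
        "external-imported", "2024-01-10T08:00:00Z", "https://example.org/partner-hospital"
    ),
)
report = add_data_origin(
    report, data_origin_extension("external-entered-internally", "2024-01-11T14:20:00Z")
)
```

Unlike meta.source, the extension can repeat, so `read_data_origins(report)` returns both origins. An external system that does not know the hypothetical URL would simply ignore them, which is the interoperability cost noted above.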
How Do You Choose the Right FHIR Data Source Tracking Method?
Choose meta.source when:
- Data sources are clearly defined and stable
- Simple categorization meets your needs
- Performance and simplicity are priorities
- Audit requirements are minimal
Choose Provenance when:
- Data passes through multiple systems
- Detailed audit trails are required
- Privacy considerations around data origins exist
- Compliance mandates comprehensive tracking
- Future scalability is a concern
Choose Extension-based solutions when:
- Standard elements don't match your data model
- You need structured categorization beyond simple URIs
- Multiple data source attributes must be tracked
- Exchange with external systems is limited, so extensions they may not understand are an acceptable trade-off
- Organizational-specific requirements exist
What Should You Consider When Implementing FHIR Data Source Tracking?
Standards Alignment
Before implementation, check your national FHIR implementation guides for specific guidance on data source tracking, as some jurisdictions may provide standardized approaches.
Hybrid Approaches
Some organizations use multiple approaches: meta.source for basic categorization, extensions for structured metadata, and Provenance for detailed audit trails on sensitive data flows.
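A hybrid setup might be sketched as a single transaction Bundle that writes the resource with meta.source set and a Provenance record pointing at it. The `urn:uuid:` references inside a transaction are standard FHIR; the function name `transaction_with_provenance`, the source URI, and the agent reference are hypothetical, and how a given server handles meta.source on write should be verified.

```python
import uuid


def transaction_with_provenance(
    resource: dict, source_uri: str, agent_ref: str, recorded: str
) -> dict:
    """Bundle a resource (tagged via meta.source) with its Provenance record."""
    resource_url = f"urn:uuid:{uuid.uuid4()}"
    provenance_url = f"urn:uuid:{uuid.uuid4()}"

    # Basic categorization travels with the resource itself.
    tagged = {**resource, "meta": {**resource.get("meta", {}), "source": source_uri}}

    # The detailed audit record lives in a separate Provenance resource.
    provenance = {
        "resourceType": "Provenance",
        # The server resolves this placeholder to the new resource's id.
        "target": [{"reference": resource_url}],
        "recorded": recorded,
        "agent": [{"who": {"reference": agent_ref}}],
    }

    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {
                "fullUrl": resource_url,
                "resource": tagged,
                "request": {"method": "POST", "url": resource["resourceType"]},
            },
            {
                "fullUrl": provenance_url,
                "resource": provenance,
                "request": {"method": "POST", "url": "Provenance"},
            },
        ],
    }


# Hypothetical example values.
bundle = transaction_with_provenance(
    {"resourceType": "Observation", "status": "final", "code": {"text": "Example"}},
    source_uri="https://example.org/fhir/source/external-imported",
    agent_ref="Organization/partner-hospital",
    recorded="2024-01-15T09:30:00Z",
)
```

Writing both entries in one transaction means the resource and its provenance succeed or fail together, which avoids records that exist without their audit trail.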
Migration Planning
Consider how your chosen approach will evolve. Extensions can often be migrated to standard elements as FHIR evolves, while other custom solutions may require more significant changes.
Automation
Regardless of approach, ensure your systems can automatically populate source information. Manual tracking inevitably leads to inconsistencies and gaps.
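As an illustration of what automation could mean in practice, the sketch below stamps a default source at the ingestion point and reports any resources that still lack one. It assumes the meta.source approach; the function names `stamp_source` and `find_untracked` and the channel URI are hypothetical.

```python
from copy import deepcopy
from typing import List


def stamp_source(resource: dict, channel_uri: str) -> dict:
    """Set meta.source from the ingestion channel unless one is already present."""
    stamped = deepcopy(resource)
    stamped.setdefault("meta", {}).setdefault("source", channel_uri)
    return stamped


def find_untracked(resources: List[dict]) -> List[str]:
    """Return 'Type/id' labels for resources that have no meta.source."""
    return [
        f"{r.get('resourceType', 'Unknown')}/{r.get('id', '(no id)')}"
        for r in resources
        if not r.get("meta", {}).get("source")
    ]


# Example batch: one resource already tagged, two without source information.
batch = [
    {"resourceType": "Observation", "id": "a", "meta": {"source": "https://example.org/fhir/source/internal-emr"}},
    {"resourceType": "Observation", "id": "b"},
    {"resourceType": "Condition", "id": "c", "meta": {}},
]

gaps_before = find_untracked(batch)
stamped_batch = [
    stamp_source(r, "https://example.org/fhir/source/external-imported") for r in batch
]
gaps_after = find_untracked(stamped_batch)
```

Running a check like `find_untracked` as part of a pipeline or a scheduled job is one way to surface gaps early instead of discovering them during an audit.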
Conclusion
The choice between meta.source, Provenance, and extension-based solutions depends on your organization's specific needs, complexity, and future requirements. Start with your use cases: if you need simple categorization for user interface hints, meta.source may suffice. If you need structured data source information with multiple attributes, consider extensions. If you're building for compliance, audit trails, or complex multi-system environments, invest in Provenance.
Remember that this decision affects not just technical implementation but also governance, privacy, and user experience across your entire healthcare ecosystem.
This analysis is based on real-world implementation discussions and FHIR R4 specifications. Always consult current FHIR documentation and your national implementation guides for the most up-to-date guidance.