Data Verification
Ar.io gateways verify that retrieved and cached data matches what was committed to Arweave. Verification helps users receive authentic, uncorrupted data without trusting a single gateway operator.
Gateway data verification is one layer of ar.io's broader verification architecture. For how gateway verification composes with signed response claims, client-side verification, and OIP accountability, see Verification and Accountability.
How Gateways Verify Data
Data verification uses Arweave data roots, hashes, and Merkle proofs to check that cached data matches what was originally stored. A gateway can verify data before serving it, and it can re-import data when verification fails.
The Verification Workflow
At a high level, verification moves through discovery, retrieval, cryptographic computation, comparison, and recovery:
1. Discovery Phase
- Periodically scan for unverified data items
- Order the verification queue by priority (higher-priority items first)
- Track retry attempts for failed verifications
2. Data Retrieval
- Fetch data attributes from gateway storage
- Retrieve the complete data stream
- Gather metadata needed for verification
3. Cryptographic Computation
- Calculate Merkle data root from actual data stream
- Generate cryptographic proofs using the same algorithm as Arweave
- Create verifiable hash chains
4. Root Comparison
- Compare computed root against indexed root in database
- Verify data hasn't been corrupted or altered
- Validate chunk integrity against Merkle proofs
5. Action Based on Results
- Success: Mark data as verified with timestamp
- Failure: Trigger re-import from Arweave or unbundle from parent
- Error: Increment retry counter and requeue for later
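The five phases above can be sketched as a single verification pass. This is a minimal illustration, not the gateway's implementation: `Record`, `run_verification_pass`, `MAX_RETRIES`, and the `fetch` and `reimport` callbacks are hypothetical names, and `compute_root` uses a plain SHA-256 digest as a stand-in for Arweave's Merkle data root.

```python
import hashlib
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

MAX_RETRIES = 3  # hypothetical retry limit, not a documented gateway setting


@dataclass
class Record:
    indexed_root: str                    # root recorded in the gateway's index
    priority: int = 0
    retries: int = 0
    verified_at: Optional[float] = None  # set once verification succeeds


def compute_root(data: bytes) -> str:
    # Stand-in only: a plain SHA-256 digest, NOT Arweave's Merkle data root.
    return hashlib.sha256(data).hexdigest()


def run_verification_pass(
    records: Dict[str, Record],
    fetch: Callable[[str], bytes],
    reimport: Callable[[str], None],
    now: float,
) -> List[Tuple[str, str]]:
    # 1. Discovery: queue unverified items, highest priority first.
    queue = [
        (-rec.priority, data_id)
        for data_id, rec in records.items()
        if rec.verified_at is None and rec.retries < MAX_RETRIES
    ]
    heapq.heapify(queue)

    outcomes = []
    while queue:
        _, data_id = heapq.heappop(queue)
        rec = records[data_id]
        try:
            data = fetch(data_id)          # 2. Data retrieval
        except OSError:
            rec.retries += 1               # 5. Error: count the attempt, retry on a later pass
            outcomes.append((data_id, "error"))
            continue
        if compute_root(data) == rec.indexed_root:   # 3 and 4. Compute, then compare
            rec.verified_at = now          # 5. Success: mark verified with a timestamp
            outcomes.append((data_id, "verified"))
        else:
            reimport(data_id)              # 5. Failure: trigger a re-import
            outcomes.append((data_id, "reimport"))
    return outcomes
```

Items that hit an error stay unverified, so the next pass picks them up again until the retry limit is reached.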
Verification Types
Ar.io gateways handle different types of data verification based on the data's origin:
Transaction Data Verification
For individual Arweave transactions:
- Direct root validation against transaction data roots stored onchain
- Complete data reconstruction from chunks to ensure availability
- Cryptographic proof that data matches what was originally stored
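As a rough illustration of root validation, the sketch below rebuilds data from chunks, computes a Merkle root over them, and compares it with an indexed root. It assumes a generic binary Merkle tree over SHA-256; Arweave's actual data root construction differs in its details, so the roots produced here would not match on-chain values. The function names are hypothetical.

```python
import hashlib
from typing import List

CHUNK_SIZE = 256 * 1024  # Arweave chunks are at most 256 KiB


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    # Empty data still yields one (empty) chunk so a root can be computed.
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def merkle_root(chunks: List[bytes]) -> bytes:
    # Leaves and inner nodes use different prefixes so one cannot pose as the other.
    level = [sha256(b"\x00" + chunk) for chunk in chunks]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(sha256(b"\x01" + level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # an unpaired node is promoted unchanged
        level = nxt
    return level[0]


def matches_indexed_root(data: bytes, indexed_root: bytes, size: int = CHUNK_SIZE) -> bool:
    # Recompute the root from the data and compare it with the stored one.
    return merkle_root(split_chunks(data, size)) == indexed_root
```

Because every chunk feeds into the root, changing any byte of the data changes the computed root and the comparison fails.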
Bundle Data Verification
For ANS-104 data bundles (collections of data items):
- Bundle integrity checks to verify the container is valid
- Individual item verification within each bundle
- Recursive unbundling to re-extract items when verification fails
- Nested bundle support for bundles containing other bundles
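The recursive shape of bundle verification can be sketched with an in-memory model. This is an assumption-laden illustration: `Item`, `digest`, and `verify_bundle` are hypothetical, the code does not parse the ANS-104 binary format, and a SHA-256 digest of the payload stands in for whatever check the gateway applies to each item.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Item:
    item_id: str
    payload: bytes
    # Non-empty when this item is itself a bundle (hypothetical in-memory model).
    children: List["Item"] = field(default_factory=list)


def digest(payload: bytes) -> str:
    # Stand-in integrity check; real data items are verified differently.
    return hashlib.sha256(payload).hexdigest()


def verify_bundle(
    items: List[Item],
    expected: Dict[str, str],
    path: Tuple[str, ...] = (),
) -> List[str]:
    """Return slash-separated paths of every item whose digest does not match."""
    failures = []
    for item in items:
        here = path + (item.item_id,)
        if digest(item.payload) != expected.get(item.item_id):
            failures.append("/".join(here))
        # Recurse into nested bundles so inner items are checked as well.
        failures.extend(verify_bundle(item.children, expected, here))
    return failures
```

A failing path such as `outer/inner-2` tells the caller which parent needs to be unbundled again to re-extract the item.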
Chunk-Level Validation
At the most granular level:
- Merkle proof validation for individual data chunks
- Sequential integrity ensuring chunks form complete data
- Parallel verification of multiple chunks for performance
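Chunk-level proof checking can be illustrated with a generic Merkle inclusion proof. The sketch below assumes a plain binary SHA-256 tree rather than Arweave's exact proof encoding, and all function names are hypothetical. Each proof is a list of sibling hashes, and a thread pool checks several chunks concurrently.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling is on the left)


def leaf_hash(chunk: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + chunk).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(chunks: List[bytes]) -> List[List[bytes]]:
    # levels[0] holds the leaves; levels[-1] holds only the root.
    levels = [[leaf_hash(c) for c in chunks]]
    while len(levels[-1]) > 1:
        prev, nxt = levels[-1], []
        for i in range(0, len(prev), 2):
            if i + 1 < len(prev):
                nxt.append(node_hash(prev[i], prev[i + 1]))
            else:
                nxt.append(prev[i])  # an unpaired node is promoted unchanged
        levels.append(nxt)
    return levels


def make_proof(levels: List[List[bytes]], index: int) -> Proof:
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):  # a promoted node has no sibling at this level
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_chunk(chunk: bytes, proof: Proof, root: bytes) -> bool:
    # Walk from the leaf up to the root, combining with each sibling in order.
    current = leaf_hash(chunk)
    for sibling, sibling_is_left in proof:
        current = node_hash(sibling, current) if sibling_is_left else node_hash(current, sibling)
    return current == root


def verify_all(chunks: List[bytes], proofs: List[Proof], root: bytes, workers: int = 4) -> List[bool]:
    # Each chunk is checked independently, so the work parallelizes cleanly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda pair: verify_chunk(pair[0], pair[1], root), zip(chunks, proofs)))
```

A proof lets a gateway validate one chunk against the root without holding every other chunk, which is what makes independent, parallel checks possible.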
Why Verification Matters
Cryptographic Trust Foundation
- Mathematical Proof: A matching Merkle root shows, up to the collision resistance of the hash function, that the data is identical to what was committed
- Independent Validation: Multiple gateways verify the same data independently
Data Integrity Guarantees
- Tamper Detection: Any alteration to data changes the computed root, so it is detected the next time the data is verified
- Corruption Recovery: Automatic healing of corrupted data through re-import
Gateway Reliability
- Continuous Monitoring: Ongoing verification catches issues before users encounter them
- Self-Healing System: Automatic recovery mechanisms maintain data availability
- Transparent Operations: Verification status and timestamps provide audit trails
Explore Gateway Systems
Data Retrieval
Learn how gateways fetch data from multiple sources with verification
Gateway Architecture
Understand the technical architecture behind verification systems
Run Your Own Gateway
Set up a gateway with built-in verification capabilities
Gateway Configuration
Configure verification settings and optimization options