Data Model

Because of Arweave's permanent and immutable nature, traditional file structure operations such as renaming and moving files or folders cannot be accomplished by simply updating on-chain data. ArFS works around this by defining an append-only transaction data model based on the metadata tags found in the Arweave Transaction Headers.

This model uses a bottom-up reference method, which avoids race conditions in file system updates. Each file contains metadata that refers to its parent folder, and each folder contains metadata that refers to its parent drive. A top-down data model, by contrast, would require each parent (e.g. a folder) to store references to its children, so adding or moving a child would also mean writing a new version of the parent.

From the following entities, a client can construct the state of a drive so that it looks and feels like a file system:

Drive Entities contain folders and files
Folder Entities contain other folders or files
File Entities contain both the file data and metadata
Snapshot Entities contain a state rollup of the metadata of all entities (drive, folder, file, and snapshot) within a drive

Entity Relationships

The following diagram shows the high-level relationships between drive, folder, and file entities, and their associated data. More detailed information about each entity type can be found in the ArFS specification documentation.

As you can see, each file and folder contains metadata that points to both its parent folder and its parent drive. The drive entity contains metadata about itself but not about its child contents, so clients must build drive state from the lowest level and work their way up.
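The bottom-up references described above can be sketched in a few lines. This is a minimal illustration, not the ArFS specification: the `Entity` class, its field names, and the short placeholder IDs are assumptions made for readability (real entity IDs are UUIDs).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    # Hypothetical in-memory shape; field names are illustrative only.
    entity_type: str                 # "drive", "folder", or "file"
    entity_id: str                   # placeholder IDs here, UUIDs in practice
    drive_id: str                    # every entity names its drive
    parent_folder_id: Optional[str]  # None for the drive and the root folder
    name: str


def index_children(entities):
    """Group folders and files under the parent folder they point to."""
    children = defaultdict(list)
    for entity in entities:
        if entity.parent_folder_id is not None:
            children[entity.parent_folder_id].append(entity.entity_id)
    return dict(children)


entities = [
    Entity("drive", "drive-1", "drive-1", None, "My Drive"),
    Entity("folder", "root", "drive-1", None, "root"),
    Entity("folder", "docs", "drive-1", "root", "docs"),
    Entity("file", "notes", "drive-1", "docs", "notes.txt"),
]
children = index_children(entities)
```

No parent entity stores a list of its children. The child listing is derived by the client from the references that the children themselves carry.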

Metadata Format

Metadata stored in any Arweave transaction tag will be defined in the following manner:

{ "name": "Example-Tag", "value": "example-data" }

Metadata stored in the Transaction Data Payload will follow JSON formatting like below:

{
  "exampleField": "exampleData"
}

Fields with a ? suffix are optional.

{
  "name": "My Project",
  "description": "This is a sample project.",
  "version?": "1.0.0",
  "author?": "John Doe"
}

Enumerated field values (those which must adhere to certain values) are defined in the format "value 1 | value 2".

All UUIDs used for Entity-Ids are based on the Universally Unique Identifier standard.

There are no requirements to list ArFS tags in any specific order.
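The conventions above (name/value tags in any order, `?`-suffixed optional fields, and `|`-separated enumerations) could be handled by a client roughly as follows. The helper names are hypothetical, and `Other-Tag` is a made-up tag used only to show that order does not matter.

```python
def tags_to_dict(tags):
    """Turn a list of {"name", "value"} tags into a dict; tag order is irrelevant."""
    return {tag["name"]: tag["value"] for tag in tags}


def parse_field_spec(spec):
    """Split a definition with "?"-suffixed keys into required and optional names."""
    required, optional = set(), set()
    for key in spec:
        if key.endswith("?"):
            optional.add(key[:-1])  # drop the "?" marker
        else:
            required.add(key)
    return required, optional


def allowed_values(enum_spec):
    """Parse an enumerated definition such as "value 1 | value 2"."""
    return [value.strip() for value in enum_spec.split("|")]


def missing_required(spec, metadata):
    """List required fields that are absent from a metadata object."""
    required, _ = parse_field_spec(spec)
    return sorted(required - set(metadata))


tags = [
    {"name": "Example-Tag", "value": "example-data"},
    {"name": "Other-Tag", "value": "other-data"},  # hypothetical second tag
]
spec = {
    "name": "My Project",
    "description": "This is a sample project.",
    "version?": "1.0.0",
    "author?": "John Doe",
}
same_in_any_order = tags_to_dict(tags) == tags_to_dict(list(reversed(tags)))
missing = missing_required(spec, {"name": "My Project"})
```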

Building Drive State

To construct the current state of a drive, clients must:

Query for all entities associated with a specific Drive-Id
Sort by block height to establish chronological order
Process entities bottom-up starting with files and folders
Build the hierarchy by following parent-child relationships
Handle conflicts by using the most recent entity version
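As a rough sketch of the steps above, the following code filters by drive, sorts by block height, keeps the most recent version of each entity, and then follows parent references upward. The `Revision` class, the function names, and the placeholder IDs are assumptions for illustration. An in-memory list stands in for the results of a real query, and entities that share a block height would need an additional tiebreaker that is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    # Hypothetical shape for one version of an entity.
    entity_type: str
    entity_id: str
    drive_id: str
    parent_folder_id: Optional[str]
    name: str
    block_height: int


def build_drive_state(revisions, drive_id):
    """Return the latest revision of every entity in the given drive."""
    # Keep only entities associated with the requested drive.
    relevant = [r for r in revisions if r.drive_id == drive_id]
    # Sort by block height to establish chronological order.
    relevant.sort(key=lambda r: r.block_height)
    # Later revisions overwrite earlier ones that share an entity ID.
    latest = {}
    for rev in relevant:
        latest[rev.entity_id] = rev
    return latest


def path_of(entity_id, latest):
    """Walk parent references upward from an entity to its root folder."""
    parts, seen = [], set()
    current = latest.get(entity_id)
    while current is not None and current.entity_id not in seen:
        seen.add(current.entity_id)  # guard against reference cycles
        parts.append(current.name)
        current = latest.get(current.parent_folder_id)
    return "/" + "/".join(reversed(parts))


revisions = [
    Revision("folder", "docs", "drive-1", "root", "documents", 120),  # rename
    Revision("drive", "drive-1", "drive-1", None, "My Drive", 100),
    Revision("folder", "root", "drive-1", None, "root", 100),
    Revision("folder", "docs", "drive-1", "root", "docs", 101),
    Revision("file", "notes", "drive-1", "docs", "notes.txt", 102),
    Revision("file", "other", "drive-2", "x", "elsewhere.txt", 103),
]
state = build_drive_state(revisions, "drive-1")
notes_path = path_of("notes", state)
```

Because the file only points at its parent folder's ID, renaming the folder changes the file's resolved path without any new transaction for the file itself.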

Entity Lifecycle

Creation

Generate unique UUID for entity
Create metadata transaction with required tags
For files: create separate data transaction
Upload to Arweave network
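The steps above for creating a file entity might look like the sketch below. It only prepares plain dictionaries; signing and uploading are left out. The tag names are modeled on the ArFS specification but should be checked against it, and the function name, dictionary layout, and metadata fields are assumptions. How the metadata refers to the data transaction is defined by the specification and is omitted here.

```python
import json
import time
import uuid


def new_file_entity(drive_id, parent_folder_id, name, data):
    """Prepare (but do not upload) a metadata transaction and a data transaction."""
    file_id = str(uuid.uuid4())  # random UUID for the new entity
    metadata_tx = {
        # Tag names are assumptions; confirm them against the ArFS specification.
        "tags": [
            {"name": "Entity-Type", "value": "file"},
            {"name": "Drive-Id", "value": drive_id},
            {"name": "Parent-Folder-Id", "value": parent_folder_id},
            {"name": "File-Id", "value": file_id},
            {"name": "Unix-Time", "value": str(int(time.time()))},
        ],
        # Illustrative JSON payload describing the file.
        "data": json.dumps({"name": name, "size": len(data)}),
    }
    # Files carry their contents in a separate data transaction.
    data_tx = {"tags": [], "data": data}
    return file_id, metadata_tx, data_tx


file_id, metadata_tx, data_tx = new_file_entity(
    "drive-1", "folder-1", "notes.txt", b"hello"
)
```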

Updates

Create new entity with same ID
Update metadata as needed
Upload new transaction
Client processes both versions and treats the most recent one as current

Deletion

Mark entity as hidden (isHidden: true)
Upload new transaction
Entity remains in history but hidden from UI
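Updates and deletions both append a new version rather than changing an old one. A minimal sketch under assumed names follows: `revise` and `current_view` are hypothetical helpers, each revision is a plain dictionary, and `isHidden` is read from the entity's metadata object.

```python
def revise(previous, block_height, **changes):
    """Build a new revision that reuses the entity ID; `previous` is not modified."""
    return {
        "entity_id": previous["entity_id"],
        "block_height": block_height,
        "metadata": {**previous["metadata"], **changes},
    }


def current_view(revisions):
    """Return the latest revision per entity ID, leaving out hidden entities."""
    latest = {}
    for rev in sorted(revisions, key=lambda r: r["block_height"]):
        latest[rev["entity_id"]] = rev
    return {
        entity_id: rev
        for entity_id, rev in latest.items()
        if not rev["metadata"].get("isHidden", False)
    }


created = {"entity_id": "file-1", "block_height": 10,
           "metadata": {"name": "draft.txt"}}
renamed = revise(created, 11, name="final.txt")  # update: same ID, new metadata
hidden = revise(renamed, 12, isHidden=True)      # deletion: mark as hidden

after_update = current_view([created, renamed])
after_delete = current_view([created, renamed, hidden])
history = [created, renamed, hidden]             # all versions remain available
```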

Data Integrity

ArFS ensures data integrity through:

Immutable transactions - Once uploaded, data cannot be modified
Cryptographic signatures - All transactions are signed by the owner
Version tracking - Multiple versions of entities can exist
Conflict resolution - Clients use block height and timestamps to resolve conflicts

Performance Considerations

For large drives, consider these optimization strategies:

Use snapshots for quick state reconstruction
Implement caching for frequently accessed data
Batch operations when possible
Query by date ranges to limit data transfer
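To illustrate the snapshot strategy, the sketch below starts from a previously rolled-up state and applies only the revisions that are newer than it. The snapshot layout used here (a `block_end` height plus an `entities` mapping) is a simplification invented for this example and is not the snapshot format defined by ArFS.

```python
def state_from_snapshot(snapshot, revisions):
    """Rebuild drive state from a snapshot plus revisions newer than it."""
    state = dict(snapshot["entities"])  # copy so the snapshot is not mutated
    # Only revisions after the snapshot's last covered block need processing.
    newer = [r for r in revisions if r["block_height"] > snapshot["block_end"]]
    for rev in sorted(newer, key=lambda r: r["block_height"]):
        state[rev["entity_id"]] = rev
    return state


snapshot = {
    "block_end": 150,  # hypothetical: last block height covered by the snapshot
    "entities": {
        "file-1": {"entity_id": "file-1", "block_height": 140, "name": "a.txt"},
    },
}
revisions = [
    {"entity_id": "file-1", "block_height": 140, "name": "a.txt"},  # already covered
    {"entity_id": "file-1", "block_height": 160, "name": "b.txt"},  # newer rename
    {"entity_id": "file-2", "block_height": 155, "name": "c.txt"},  # newer file
]
state = state_from_snapshot(snapshot, revisions)
```

In a real client the filter would be part of the query itself, so that older transactions are never transferred at all.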

Next Steps

Now that you understand the ArFS data model, learn how to work with it:

Privacy & Encryption - Secure your data with private drives
Creating Drives - Start building with ArFS
Reading Data - Query and retrieve your data
