There has been some interest in and demand for using IPFS + Arweave together; below is a proposal for how we can do this:

Definitions

There are two main forms of interoperability that could be considered with IPFS:

  1. Make it possible to load existing data stored with Arweave (and identified with Arweave identifiers) using IPFS tooling
    1. An example might be a browser plugin/translator that, given some Arweave identifier ar://<arID>, loads an equivalent ipfs://<cid-and-path> that can fetch the data from a local kubo node hosting a copy of the data. This could help as a performance optimization, if the user is offline, if public Arweave gateways are blocked, …
  2. Make it possible to take existing data stored with IPFS compatible systems and store it with Arweave such that it is also retrievable using IPFS tooling
    1. An example might be to take a JPEG hosted on a local kubo node with CID bafyfoo and store it with Arweave such that the local kubo node can be turned off and another user’s Brave browser, or a public gateway URL like bafyfoo.ipfs.dweb.link, can still load the JPEG
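
As a rough illustration of the translator idea in option 1, assuming a local index from Arweave transaction IDs to CIDs already exists (how that index gets populated, e.g. from tags on the Arweave data item, is out of scope here), the URL rewrite might look like:

```python
import re

# Hypothetical local index mapping Arweave transaction IDs to IPFS CIDs.
# The txid and CID below are placeholders, not real identifiers.
AR_TO_CID = {
    "example-ar-tx-id": "bafyfoo",
}

# Arweave transaction IDs are base64url strings.
AR_URL = re.compile(r"^ar://(?P<txid>[A-Za-z0-9_-]+)(?P<path>/.*)?$")

def translate(ar_url: str, local_gateway: str = "http://127.0.0.1:8080"):
    """Rewrite an ar:// URL to a local IPFS gateway URL, if the CID is known.

    Returns None when the URL is not an ar:// URL or the mapping is unknown,
    in which case the caller would fall back to a public Arweave gateway.
    """
    m = AR_URL.match(ar_url)
    if m is None:
        return None
    cid = AR_TO_CID.get(m.group("txid"))
    if cid is None:
        return None
    return f"{local_gateway}/ipfs/{cid}{m.group('path') or ''}"
```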

Below are some proposals for how we can do this. The proposals are based on the following assumptions/understandings from conversations with people more knowledgeable about the Arweave ecosystem than the author.

  1. It is more important to handle storing existing IPFS data with Arweave (interop option 2)
  2. Practically all retrieval by end users in the Arweave ecosystem happens through Arweave gateways rather than miners today, so continuing that trend for IPFS interop is fine (i.e. no need to change the software miners run)

Proposal

When indexing the chain (or receiving chain updates) and seeing a new data item:

  1. Check if there is an Arweave tag indicating IPFS data (e.g. a new IPFS-CAR tag, or reuse the Content-Type tag with the existing IPFS media types registered with IANA: application/vnd.ipld.car and application/vnd.ipld.raw)
  2. If so, read through the CAR, validating each block, and add an entry to a local database mapping multihash (the relevant portion of the CID) → the data
    1. Note: the most cost-effective way to do this will depend on the Arweave gateway infrastructure. It might be reasonable to store mappings from multihash to the block’s offset within the CAR so the data can easily be served in the same way
  3. For serving via an Arweave gateway:
    1. On inbound requests to an Arweave gateway, route IPFS requests to the IPFS handler
      1. IPFS requests look like *.ip(f|n)s.gateway.tld or gateway.tld/ip(f|n)s/*
    2. Use existing IPFS tooling hooked up to the database to answer gateway requests
  4. For serving data to the p2p network (e.g. Brave, local kubo nodes, ipfs.io, etc.)
      1. Advertise the data to IPNI and/or the DHT
    2. Serve data:
      1. Short term: set up a system for serving data via Bitswap backed by the gateway infrastructure’s storage
      2. Medium term: reuse the same infrastructure as for serving via the Arweave gateway, since it can support https://specs.ipfs.tech/http-gateways/trustless-gateway/
        1. Depending on how valuable you’d find this, we can figure out the timeline for when it will be doable so that Bitswap support wouldn’t be needed.
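
Steps 1 and 2 above can be sketched with stdlib-only Python. This is a simplified sketch, not a full CAR parser: it skips over the dag-cbor header rather than decoding it, assumes CIDv1 with sha2-256 multihashes (a real implementation would dispatch on the multihash code and also handle CIDv0), and the (offset, length) index entries are just one possible shape for the database mapping mentioned in step 2.1.

```python
import hashlib
import io

def read_uvarint(stream):
    """Read an unsigned LEB128 varint, as used for CARv1 length prefixes."""
    shift, result = 0, 0
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("truncated varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7

def index_car(stream):
    """Scan a CARv1 stream, validate each block, and return a dict mapping
    multihash bytes -> (absolute offset of block data, block length)."""
    header_len = read_uvarint(stream)
    stream.read(header_len)  # skip the dag-cbor header in this sketch
    index = {}
    while True:
        try:
            section_len = read_uvarint(stream)
        except EOFError:
            return index  # clean end of stream
        section_start = stream.tell()
        section = stream.read(section_len)
        sec = io.BytesIO(section)
        read_uvarint(sec)              # CID version, expect 1
        read_uvarint(sec)              # codec, e.g. 0x55 raw, 0x70 dag-pb
        hash_code = read_uvarint(sec)  # multihash code, 0x12 is sha2-256
        digest_len = read_uvarint(sec)
        digest = sec.read(digest_len)
        data_offset = sec.tell()       # bytes of CID consumed
        data = section[data_offset:]
        if hash_code == 0x12 and hashlib.sha256(data).digest() != digest:
            raise ValueError("block failed validation")
        multihash = bytes([hash_code, digest_len]) + digest
        index[multihash] = (section_start + data_offset, len(data))
```

Storing offsets into the CAR, rather than copies of the blocks, is what makes step 2.1 cheap: the gateway can serve a block with a single range read against the CAR it already stores.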
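
The request classification in step 3.1 can be illustrated as follows. gateway.tld and the classify helper are placeholders; a real deployment would hook this into the gateway’s existing router, and subdomain requests would normally arrive via the Host header rather than a full URL.

```python
import re
from urllib.parse import urlsplit

# Placeholder gateway host. The two patterns mirror the request shapes in
# the proposal: *.ip(f|n)s.gateway.tld and gateway.tld/ip(f|n)s/*.
GATEWAY_HOST = "gateway.tld"

SUBDOMAIN = re.compile(
    rf"^(?P<root>[^.]+)\.(?P<ns>ipfs|ipns)\.{re.escape(GATEWAY_HOST)}$"
)
PATH = re.compile(r"^/(?P<ns>ipfs|ipns)/(?P<root>[^/]+)")

def classify(url: str):
    """Return ('ipfs' | 'ipns', root) when the request should go to the
    IPFS handler, or None for an ordinary Arweave gateway request."""
    parts = urlsplit(url)
    m = SUBDOMAIN.match(parts.hostname or "")
    if m:
        return m.group("ns"), m.group("root")
    if parts.hostname == GATEWAY_HOST:
        m = PATH.match(parts.path)
        if m:
            return m.group("ns"), m.group("root")
    return None
```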

Alternatives

There are a number of options we could look at here if a more native Arweave-like integration is desired, or if there are optimizations you’re worried about (e.g. serving the data using both IPFS identifiers and Arweave identifiers backed by the same bytes, while saving on UnixFS processing time with minimal caching). However, the initial proposal seems like a good way to get the conversation started.