Guile-ERIS

This document describes Guile-ERIS version 1.1.0.

The Encoding for Robust Immutable Storage (ERIS) is an encoding of arbitrary content into a set of uniformly sized, encrypted and content-addressed blocks as well as a read capability. The content can be reassembled from the encrypted blocks only with this read capability. The read capability can be encoded as a URN, allowing encoded content to be referenced from existing applications.

For more information on ERIS see also the specification document.

Guile-ERIS is a Guile Scheme implementation of ERIS.

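1 Introduction
- 1.1 Contact
2 Overview and Concepts
- 2.1 Block Size
- 2.2 Convergent Encryption
- 2.3 Security Considerations
3 Encoding and Decoding Content
- 3.1 Encoding
  - 3.1.1 Encoder
- 3.2 Decoding
  - 3.2.1 Decoder
- 3.3 Working with Read Capabilities
4 Transport and Storage of Blocks
- 4.1 CoAP
- 4.2 Utilities
  - 4.2.1 Cache
  - 4.2.2 Lookahead
Index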

1 Introduction

A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.

— Leslie Lamport

Unavailability of content is a major cause of reduced reliability in distributed systems.

Availability can be increased by caching content on multiple peers. However, most content on the Internet is identified by its location. Caching location-addressed content is complicated as the content receives a new location when cached.

Caching content-addressed content and making it available redundantly is much easier as the content is completely decoupled from any physical location. Integrity of content is automatically ensured with content-addressing (when using a cryptographic hash) as the identifier of the content can be recomputed to check that the content matches the requested identifier. However, naive content-addressing has certain drawbacks:

Large content is stored as a large chunk of data. In order to optimize storage and network operations it is better to split up content into smaller uniformly sized blocks and reassemble blocks when needed.
Unencrypted: Content is readable by all peers involved in transporting, caching and storing content.

ERIS addresses these issues by splitting content into small uniformly sized and encrypted blocks. These blocks can be reassembled to the original content only with access to a short read capability, which can be encoded as a URN.

Encodings similar to ERIS are already widely used in applications and protocols such as GNUnet, BitTorrent, Freenet, Gnutella, Direct Connect, IPFS and others. However, they all use slightly different encodings that are tied to the respective protocols and applications. ERIS defines an encoding independent of any specific protocol or application and decouples content from transport and storage layers.

Blocks of encoded content can be transported over various protocols and stored on various media. Implementations of and bindings to transport layers are (currently) beyond the scope of this library. See Transport and Storage of Blocks for some ideas and pointers.

See the specification document for an in-depth description of the encoding. A short overview is provided in this manual (see Overview and Concepts).

1.1 Contact

The source code of Guile-ERIS is available on Codeberg.

A mailing list for general discussion on ERIS is available at ~pukkamustard/eris@lists.sr.ht. Ephemeral discussions take place in the #eris channel on the Libera IRC network. See also the project page for more information.

Please feel free to direct any questions or comments to the mailing list or to the IRC channel. You are also invited to share your applications and use-cases.

2 Overview and Concepts

ERIS splits content into uniformly sized and encrypted blocks (leaf nodes). References and keys to blocks are collected in intermediary nodes. Nodes are encrypted and references are recursively collected in higher level nodes until a single root node remains.

To decode content we need:

The reference to the root node
The key to decrypt the root node
The level of the root node

This information is encoded in the read capability. The read capability can be encoded as a binary string or as a URN (see Working with Read Capabilities).

2.1 Block Size

ERIS allows two fixed block sizes:

small (1 KiB): Allows efficient encoding of small pieces of content (e.g. meta-data or short messages). It is recommended to use the small block size for content smaller than 16 KiB.
large (32 KiB): Recommended for content larger than 16 KiB. This reduces the storage overhead as internal nodes can hold references to more nodes and is faster as cryptographic operations can be run on larger pieces of data.
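
The trade-off between the two block sizes can be illustrated by counting blocks. The following is a minimal sketch and not part of Guile-ERIS: the procedure name eris-tree-shape is hypothetical, and it assumes the rules of the ERIS specification that content is always padded by at least one byte up to a multiple of the block size and that an internal node holds up to block-size / 64 reference-key pairs.

```scheme
;; Hypothetical helper, not part of Guile-ERIS.
;; Returns two values: the level of the root node and the total
;; number of blocks needed for CONTENT-LENGTH bytes of content.
(define (eris-tree-shape content-length block-size)
  ;; Assumption (ERIS specification): one reference-key pair takes 64 bytes.
  (let ((arity (quotient block-size 64)))
    ;; Assumption (ERIS specification): padding adds at least one byte,
    ;; so there is always one more leaf than full blocks of content.
    (let loop ((nodes (+ 1 (quotient content-length block-size)))
               (level 0)
               (total 0))
      (if (= nodes 1)
          (values level (+ total 1))
          ;; Collect the nodes of this level into ceil(nodes / arity)
          ;; nodes on the next level.
          (loop (quotient (+ nodes arity -1) arity)
                (+ level 1)
                (+ total nodes))))))

(eris-tree-shape 12 1024)
;; => 0
;; => 1

(eris-tree-shape (* 1024 1024) 1024)
;; => 3
;; => 1096

(eris-tree-shape (* 1024 1024) 32768)
;; => 1
;; => 34
```

Under these assumptions, 1 MiB of content needs 1096 blocks (1096 KiB) and a tree of level 3 with the small block size, but only 34 blocks (1088 KiB) and a tree of level 1 with the large block size. A 12 byte message, on the other hand, occupies a single 1 KiB block instead of a single 32 KiB block.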

2.2 Convergent Encryption

Deterministic convergent encryption allows deterministic identifiers of content. However, it also suffers from two known attacks that might allow adversaries to confirm that certain files have been encoded (the Confirmation of a File Attack) and in some cases even to learn the encoded content (the Learn-the-Remaining-Information Attack).

The solution to both attacks is to use a convergence secret. A convergence secret adds some entropy to the encoding process. Using the same convergence secret results in the same encoding and the same identifiers for the same content. Using different convergence secrets results in different encodings and different identifiers for the same content.

The convergence secret allows a flexible level of protection against the known attacks on deterministic convergent encryption:

Random and unique convergence secret for every piece of content: This allows the highest level of protection against the known attacks.
Shared convergence secret: A group may use a shared convergence secret. This allows members of the group to use deterministic identifiers and de-duplicate content while being secure against the attacks on convergent encryption from adversaries that do not know the convergence secret.
Fixed public convergence secret: For some applications de-duplication and deterministic identifiers are more important than being safe from the attacks on convergent encryption. In such cases a fixed and publicly known convergence secret can be used. The %null-convergence-secret (see Encoding and Decoding Content) may be used as such a publicly known convergence secret.

The convergence secret is not required when decoding content.
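
A random and unique convergence secret is simply 32 random bytes. The following is a minimal sketch of one way to obtain them; the helper random-convergence-secret is hypothetical (not part of Guile-ERIS) and assumes an operating system that provides /dev/urandom.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hypothetical helper, not part of Guile-ERIS.
;; Reads 32 random bytes from the operating system.
;; Assumes that /dev/urandom is available.
(define (random-convergence-secret)
  (call-with-input-file "/dev/urandom"
    (lambda (port) (get-bytevector-n port 32))
    #:binary #t))

(define secret (random-convergence-secret))

(bytevector-length secret)
;; => 32
```

The resulting bytevector can be passed as the convergence secret when encoding. As it is not needed for decoding, it does not have to be stored afterwards.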

For more information please consult the ERIS specification.

2.3 Security Considerations

ERIS is designed to increase the availability of content. ERIS is NOT designed for secure communication.

Please read the specification for detailed security considerations.

3 Encoding and Decoding Content

(use-modules (eris))

Functionality to encode and decode content using ERIS is provided in the (eris) module.

Note that the version of the ERIS specification implemented is distinct from the version of this library. The variable %eris-spec-version holds the version of the ERIS specification implemented by this library:

Scheme Variable: %eris-spec-version ¶: The version of the ERIS specification implemented.

3.1 Encoding

Some content that is available in a string, a bytevector or in an input port can be encoded with ERIS. The output is a set of content-addressable blocks and a read capability.

Guile-ERIS does not make any assumptions about how to store or transport blocks (see Transport and Storage of Blocks). The encoding procedures take a SRFI-171 reducer (see SRFI-171 Reducers in GNU Guile Reference Manual) that stores blocks. The reducing procedure is called with no arguments to return an initial state, with two arguments (the current state, and the input block) when blocks should be stored and with a single argument (the final state) when encoding is completed.

Reducers from SRFI-171 can be used as is. For example rcons can be used to create an association list from reference to block or rcount can be used to count the number of blocks encoded. Specialized reducers are provided by this library (see Transport and Storage of Blocks).
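
The three arities of a reducer can be tried out without encoding anything. The following sketch defines a hypothetical reducer, block-stats-reducer, that keeps the number of blocks and the total number of bytes as its state, and drives it by hand with list-transduce. The two blocks are made-up placeholders (all-zero bytevectors with arbitrary 32 byte references), not real ERIS blocks.

```scheme
(use-modules (srfi srfi-171)
             (rnrs bytevectors))

;; Hypothetical reducer, not part of Guile-ERIS.
;; The state is a pair: (number of blocks . total size in bytes).
(define block-stats-reducer
  (case-lambda
    ;; Initialization: no blocks seen yet
    (() (cons 0 0))

    ;; Completion: return the state unchanged
    ((state) state)

    ;; Input: ref-block is a pair (reference . block)
    ((state ref-block)
     (cons (+ 1 (car state))
           (+ (bytevector-length (cdr ref-block)) (cdr state))))))

;; Placeholder input shaped like the pairs a block reducer receives.
(define fake-blocks
  (list (cons (make-bytevector 32 1) (make-bytevector 1024 0))
        (cons (make-bytevector 32 2) (make-bytevector 1024 0))))

(list-transduce (tmap (lambda (ref-block) ref-block))
                block-stats-reducer
                fake-blocks)
;; => (2 . 2048)
```

A reducer written like this can be passed wherever the encoding procedures expect a block reducer.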

For convenience, the null convergence secret is provided as a variable:

Scheme Variable: %null-convergence-secret ¶: A bytevector of size 32 bytes with all bytes set to zero. Using the %null-convergence-secret as convergence secret enables deterministic convergent encryption. Users should be aware of the security implications of using deterministic convergent encryption (see Convergent Encryption).

The most direct encoding procedure is eris-encode, which can be used to encode some content in one go:

Scheme Procedure: eris-encode readable #:block-size block-size #:block-reducer block-reducer #:convergence-secret convergence-secret ¶

Encode content from the port, string or bytevector readable and reduce blocks into block-reducer. Returns two values: the read capability as URI string and the reduced blocks.

readable is either an input port, a string or a bytevector holding the content to be encoded.

block-size specifies the block size that should be used to encode the content (see Block Size). Valid values are 'small and 'large.

block-reducer specifies a SRFI-171 reducer that is used to store the blocks. The values input to the reducer are pairs consisting of the reference to the block (as bytevector) and the block itself (as bytevector). The reference to the block is always the Blake2b hash of the block. By default reverse-rcons is used, which stores blocks in a list.

convergence-secret specifies the convergence secret as bytevector of size 32 bytes (see Convergent Encryption).

Example:

(use-modules (eris)
	     (sodium generichash)
	     (rnrs base)
	     (rnrs bytevectors))

(define (blake2b-256 bv)
  (crypto-generichash bv #:out-len 32))

(define my-block-reducer
  (case-lambda
    ;; Initialization
    (() '()) ; just return an empty list

    ;; Completion
    ((result) result) ; nothing to do

    ;; Input
    ((result ref-block)
     (let ((ref (car ref-block)) ; the reference to the block
	   (block (cdr ref-block))) ; the block itself

       (assert (bytevector=? ref (blake2b-256 block)))
       (assert (equal? (bytevector-length block) 1024))

       ;; add the ref-block to the result
       (cons ref-block result)))))

(eris-encode "Hello world!"
	     #:block-size 'small
	     #:block-reducer my-block-reducer
	     #:convergence-secret %null-convergence-secret)

;; => "urn:eris:BIAD77QDJMFAKZYH2DXBUZYAP3MXZ3DJZVFYQ5DFWC6T65WSFCU5S2IT4YZGJ7AC4SYQMP2DM2ANS2ZTCP3DJJIRV733CRAAHOSWIYZM3M"
;; => ( ... ) ; list of blocks

Note that my-block-reducer is exactly reverse-rcons from SRFI-171 (without the asserts). The initialization and completion arity could be used for stateful setup and tear-down of a block storage (e.g. a database connection).

Alternatively, a binary output port can be opened that will encode content that is written to it. This is useful for encoding content that is larger than the available memory or is obtained as a stream.

Scheme Procedure: open-eris-output-port #:block-size block-size #:block-reducer block-reducer #:convergence-secret convergence-secret ¶

Returns two values: a binary output port and a procedure. The latter should be called with zero arguments to obtain the read capability of all the data accumulated by the port and the reduced blocks.

Encoded blocks are emitted eagerly. The memory requirement is logarithmic (with a large base) in the size of the input.

block-size specifies the block size that should be used to encode the content (see Block Size). Valid values are 'small and 'large.

convergence-secret specifies the convergence secret as bytevector of size 32 bytes (see Convergent Encryption).

Example:

(use-modules (eris)
	     (srfi srfi-71))

(let ((port finalize
	    (open-eris-output-port
	     #:block-size 'small
	     #:convergence-secret %null-convergence-secret)))

  (display "Hello " port)
  (display "world!" port)

  (finalize))

;; => "urn:eris:BIAD77QDJMFAKZYH2DXBUZYAP3MXZ3DJZVFYQ5DFWC6T65WSFCU5S2IT4YZGJ7AC4SYQMP2DM2ANS2ZTCP3DJJIRV733CRAAHOSWIYZM3M"
;; => ( ... ) ; list of blocks

3.1.1 Encoder

(use-modules (eris encoder))

The (eris encoder) module provides an encoder object. This is a more low-level interface for encoding content. For example, it allows the block reducer's state to be inspected and modified while encoding. The procedures eris-encode and open-eris-output-port are implemented on top of this interface.

Scheme Procedure: eris-encoder-init #:block-size block-size #:convergence-secret convergence-secret #:block-reducer block-reducer #:identity identity ¶

Returns a new encoder object (an <eris-encoder> object).

block-size specifies the block size in bytes. Valid values are 1024 (for small block size) or 32768 (for large block size).

identity specifies the initial reducer state. If identity is not given, the block-reducer is run without arguments to return the initial reducer state.

convergence-secret specifies the convergence secret as bytevector of size 32 bytes (see Convergent Encryption).

Scheme Procedure: eris-encoder? obj ¶: Returns true if obj is an encoder object.

Scheme Procedure: eris-encoder-update encoder bv ¶

Add bv to the content being encoded and return the updated encoder.

Content that is smaller than block-size is buffered by the encoder. For best performance add pieces of size exactly block-size to the encoder.

Scheme Procedure: eris-encoder-finalize encoder #:finalize-reducer finalize-reducer? ¶

Finalizes the encoder with all content previously added with eris-encoder-update and returns two values: the read capability (as <read-capability> object, see Working with Read Capabilities) and the reduced blocks.

If finalize-reducer? is true (default value is true) then the reducer state is finalized by running the procedure block-reducer on the final reducer state.

Scheme Procedure: eris-encoder-result encoder ¶: Returns the current state of the block reducer.

Scheme Procedure: set-eris-encoder-result! encoder value ¶: Sets the state of the block reducer.
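
The procedures above could be combined as follows. This is a minimal sketch based only on the signatures documented in this section, not a tested recipe: it feeds the content in two pieces, counts the emitted blocks with rcount from SRFI-171 instead of storing them, and converts the resulting read capability to its URN encoding.

```scheme
(use-modules (eris)
             (eris encoder)
             (eris read-capability)
             (srfi srfi-71)
             (srfi srfi-171)
             (rnrs bytevectors))

(let* ((encoder (eris-encoder-init
                 #:block-size 1024 ; small block size, given in bytes
                 #:convergence-secret %null-convergence-secret
                 #:block-reducer rcount))
       ;; Always continue with the encoder returned by the update.
       (encoder (eris-encoder-update encoder (string->utf8 "Hello ")))
       (encoder (eris-encoder-update encoder (string->utf8 "world!")))
       ;; Two values: the read capability object and the reducer result
       ;; (here the number of blocks, as rcount is used).
       (read-capability block-count (eris-encoder-finalize encoder)))
  (values (eris-read-capability->string read-capability)
          block-count))
```

Between updates, eris-encoder-result could be used to look at the number of blocks emitted so far.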

3.2 Decoding

Two procedures are provided for decoding content: eris-decode and open-eris-input-port.

eris-decode can be used to decode the entire content at once:

Scheme Procedure: eris-decode read-capability #:block-ref block-ref ¶

Returns the decoded content as bytevector.

read-capability is the ERIS read capability as uri, string, bytevector (binary encoded read capability) or as a read-capability record (see Working with Read Capabilities).

block-ref is a procedure which is called to de-reference blocks. A single argument is passed with the reference of the block as bytevector (the Blake2b hash of the block).

Example:

(use-modules (eris)
             (srfi srfi-71) ; let with multiple values
             (rnrs bytevectors))

(let ((read-capability blocks
       (eris-encode "Hello world!"
		    #:block-size 'small
		    #:convergence-secret %null-convergence-secret)))
  (utf8->string
   (eris-decode read-capability
		#:block-ref (lambda (ref) (assoc-ref blocks ref)))))

;; => "Hello world!"

open-eris-input-port allows more fine-grained access. The encoded content is exposed on a binary input port. The binary port allows random-access to specific positions of the content, while only de-referencing the necessary blocks:

Scheme Procedure: open-eris-input-port read-capability #:block-ref block-ref ¶

Returns a binary input port that can be used to read decoded content.

read-capability is the ERIS read capability as uri, string, bytevector (binary encoded read capability) or as a read-capability record (see Working with Read Capabilities).

block-ref is a procedure which is called to de-reference blocks. A single argument is passed with the reference of the block as bytevector (the Blake2b hash of the block).

Example:

(use-modules (eris)
             (srfi srfi-71) ; let* with multiple values
             (rnrs io ports))

(let* ((read-capability
	blocks
	(eris-encode "Hello world!"
		     #:block-size 'small
		     #:convergence-secret %null-convergence-secret))
       (port (open-eris-input-port read-capability
				   #:block-ref
				   (lambda (ref) (assoc-ref blocks ref)))))

  (get-string-n port 2)
  ;; => "He"
 
  (set-port-position! port 6)
  (get-string-all port)
  ;; => "world!"

  (set-port-position! port 0)
  (get-string-all port)
  ;; => "Hello world!"

  (port-position port))

;; => 12

Another procedure is provided offering a balance between eris-decode and open-eris-input-port:

Scheme Procedure: eris-transduce xform f read-capability #:block-ref block-ref ¶

Decodes the ERIS encoded content by reducing it with the reducer f applied to the transducer xform.

xform is an SRFI-171 transducer.

f is an SRFI-171 reducer that is fed with the decoded content as bytevectors.

read-capability is the ERIS read capability as uri, string, bytevector (binary encoded read capability) or as a read-capability record (see Working with Read Capabilities).

block-ref is a procedure which is called to de-reference blocks. A single argument is passed with the reference of the block as bytevector (the Blake2b hash of the block).

This allows decoding content in one go (like eris-decode) while allowing efficient stream-processing using transducers (like open-eris-input-port).

See also the section on transducers in the Guile Reference Manual (see SRFI-171 Reducers in GNU Guile Reference Manual).

Another advantage is that we don’t need to create any custom ports that create continuation barriers. eris-transduce is compatible with fun stuff like fibers and Spritely Goblins.
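
A minimal sketch of how eris-transduce might be used, following the signature documented above: the transducer maps every decoded bytevector to its length and the reducer + sums the lengths, so that the size of the content is computed without ever holding all of it in memory. The in-memory association list of blocks is the one returned by eris-encode with its default reducer.

```scheme
(use-modules (eris)
             (srfi srfi-71)
             (srfi srfi-171)
             (rnrs bytevectors))

(let ((read-capability blocks
       (eris-encode "Hello world!"
                    #:block-size 'small
                    #:convergence-secret %null-convergence-secret)))
  ;; + works as a SRFI-171 reducer: (+) is 0, (+ x) is x and
  ;; (+ x y) adds the next length to the running total.
  (eris-transduce (tmap bytevector-length)
                  +
                  read-capability
                  #:block-ref (lambda (ref) (assoc-ref blocks ref))))
```

Any other SRFI-171 transducer and reducer can be substituted, for example to write the decoded bytevectors to a file as they arrive.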

3.2.1 Decoder

(use-modules (eris decoder))

The (eris decoder) module provides a decoder object. This is a more low-level interface for decoding content. For example, it allows efficient computation of the length of some encoded content while de-referencing a minimal number of blocks. It can also be used to implement parallel decoders or look-ahead decoders (when using a caching block-ref procedure).

Decoders are purely functional. The implementation is based on Zippers (Functional Pearl: The Zipper, Gérard Huet, 1997).

Scheme Procedure: eris-decoder-init read-capability #:block-ref block-ref ¶

Returns an initialized <eris-decoder> object. Initial position is set to the beginning of the content. No blocks are de-referenced during initialization.

read-capability is the ERIS read capability as uri, string, bytevector (binary encoded read capability) or as a read-capability record (see Working with Read Capabilities).

block-ref is a procedure which is called to de-reference blocks. A single argument is passed with the reference of the block as bytevector (the Blake2b hash of the block).

Scheme Procedure: eris-decoder? obj ¶: Returns true if obj is an ERIS decoder (a <eris-decoder> object).

Scheme Procedure: eris-decoder-position decoder ¶: Returns current position of decoder as offset from the beginning of the encoded content.

Scheme Procedure: eris-decoder-seek decoder pos ¶: Returns a new decoder with position set to pos.

Scheme Procedure: eris-decoder-length decoder ¶

Returns the length of the encoded content in bytes.

The length is computed by descending the tree along the rightmost node at every level.

Scheme Procedure: eris-decoder-read! decoder target target-start len ¶: Reads at most len bytes from decoder and copies them to the bytevector target starting at offset target-start. Returns two values: The number of bytes read and a new decoder with position set to after the bytes read.
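
The decoder procedures could be combined as follows. This is a minimal sketch based only on the signatures documented in this section: it computes the length of the content, seeks to offset 6 and reads the remaining bytes into a bytevector. As decoders are purely functional, every step continues with the decoder returned by the previous one.

```scheme
(use-modules (eris)
             (eris decoder)
             (srfi srfi-71)
             (rnrs bytevectors))

(let* ((read-capability blocks
        (eris-encode "Hello world!"
                     #:block-size 'small
                     #:convergence-secret %null-convergence-secret))
       (decoder (eris-decoder-init
                 read-capability
                 #:block-ref (lambda (ref) (assoc-ref blocks ref))))
       (content-length (eris-decoder-length decoder))
       ;; Seeking returns a new decoder, the old one is left untouched.
       (decoder (eris-decoder-seek decoder 6))
       (target (make-bytevector (- content-length 6)))
       ;; Two values: the number of bytes read and a new decoder.
       (bytes-read decoder
                   (eris-decoder-read! decoder target 0 (- content-length 6))))
  (utf8->string target))
```

Note that eris-decoder-read! reads at most the requested number of bytes, so robust code should check the first return value and continue reading with the returned decoder if necessary.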

3.3 Working with Read Capabilities

(use-modules (eris read-capability))

Read capabilities hold the necessary information to decode ERIS encoded content from blocks.

Read capabilities can be encoded as binary sequences (i.e. as bytevectors in Scheme) or as URIs (URNs to be precise). The (eris read-capability) module provides functions for working with read capabilities.

Scheme Procedure: make-eris-read-capability block-size level root-reference root-key ¶: Returns a new read capability object.

Scheme Procedure: eris-read-capability-block-size read-capability ¶: Returns the block size of read-capability.

Scheme Procedure: eris-read-capability-level read-capability ¶: Returns the level of read-capability.

Scheme Procedure: eris-read-capability-root-reference read-capability ¶: Returns the root reference of read-capability.

Scheme Procedure: eris-read-capability-root-key read-capability ¶: Returns the root key of read-capability.

Scheme Procedure: eris-read-capability->bytevector read-capability ¶: Returns the binary encoding of read-capability as bytevector.

Scheme Procedure: eris-read-capability->string read-capability ¶: Returns the URN encoding of read-capability as string.

Scheme Procedure: ->eris-read-capability obj ¶: Returns a new read capability object that is parsed from obj. obj is either a bytevector containing the binary encoding of a read capability, a string containing the URN encoding, or a URI. If no read capability can be parsed, #f is returned.
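
A minimal sketch of how these procedures might be used together, based only on the signatures documented in this section: a read capability is parsed from its URN encoding (the one from the "Hello world!" example in Encoding), its fields are inspected and it is converted back to a string.

```scheme
(use-modules (eris read-capability)
             (rnrs bytevectors))

(define urn
  "urn:eris:BIAD77QDJMFAKZYH2DXBUZYAP3MXZ3DJZVFYQ5DFWC6T65WSFCU5S2IT4YZGJ7AC4SYQMP2DM2ANS2ZTCP3DJJIRV733CRAAHOSWIYZM3M")

(let ((read-capability (->eris-read-capability urn)))
  ;; ->eris-read-capability returns #f if nothing could be parsed.
  (if read-capability
      (list (eris-read-capability-block-size read-capability)
            (eris-read-capability-level read-capability)
            (bytevector-length
             (eris-read-capability-root-reference read-capability))
            ;; Converting back is expected to yield the same URN.
            (string=? urn (eris-read-capability->string read-capability)))
      'not-a-read-capability))
```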

4 Transport and Storage of Blocks

Currently we provide block reducers and block de-reference functions for accessing ERIS CoAP stores (see ERIS over CoAP) as well as utility functions for improving performance.

Other possible transport and storage layers include:

An in-memory hash-map. A reducer would be very similar to rcons as provided by SRFI-171.
An SQLite database using the guile-sqlite3 bindings.
Storing blocks in files on a file-system.
Fetching and posting blocks to an HTTP endpoint. See also ERIS over HTTP.
Using IPFS as a block storage. See ipfs.scm.

Many other block transports and storages should be possible. Get in contact and get hacking!
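
As an illustration of the file-system idea from the list above, the following is a minimal sketch of a block reducer and a matching block de-reference procedure that keep every block in its own file. All names (bytevector->hex, block-path, make-file-block-reducer, make-file-block-ref) are hypothetical and not part of Guile-ERIS. The sketch assumes that the directory already exists and that returning #f is an acceptable answer for an unknown block, as assoc-ref does in the decoding examples.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hex-encode a bytevector so that a reference can be used as file name.
(define (bytevector->hex bv)
  (string-concatenate
   (map (lambda (byte)
          (string-pad (number->string byte 16) 2 #\0))
        (bytevector->u8-list bv))))

(define (block-path directory ref)
  (string-append directory "/" (bytevector->hex ref)))

;; Returns a block reducer that writes every block to a file in
;; DIRECTORY. The reducer state is the number of blocks written.
(define (make-file-block-reducer directory)
  (case-lambda
    ;; Initialization
    (() 0)

    ;; Completion
    ((count) count)

    ;; Input: ref-block is a pair (reference . block)
    ((count ref-block)
     (call-with-output-file (block-path directory (car ref-block))
       (lambda (port) (put-bytevector port (cdr ref-block)))
       #:binary #t)
     (+ count 1))))

;; Returns a procedure that reads the block with the given reference
;; from DIRECTORY, or returns #f if there is no such file.
(define (make-file-block-ref directory)
  (lambda (ref)
    (let ((path (block-path directory ref)))
      (and (file-exists? path)
           (call-with-input-file path get-bytevector-all #:binary #t)))))
```

The result of (make-file-block-reducer directory) could then be passed as #:block-reducer when encoding and the result of (make-file-block-ref directory) as #:block-ref when decoding. Since blocks are content-addressed, writing the same block twice simply overwrites the file with identical data.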

4.1 CoAP

(use-modules (eris coap))

There are many ways to transport blocks over the network. Using CoAP is one, and maybe not such a bad one.

See ERIS over CoAP for more information.

To store blocks in an ERIS CoAP store:

Scheme Procedure: eris-coap-block-reducer uri #:nstart nstart ¶

Returns a block reducer that stores blocks to the CoAP store reachable at the URL uri.

The nstart argument can be used to specify the maximum number of in-flight CoAP requests. If not provided, it defaults to 8.

This reducer must be invoked with a running Fibers scheduler (see Using Fibers in The Fibers manual).

To decode content from an ERIS CoAP store:

Scheme Procedure: eris-coap-block-ref uri ref #:connection connection ¶

Retrieve a block from an ERIS CoAP store reachable at uri.

An existing CoAP TCP connection can be provided with connection.

This procedure must be called with a running Fibers scheduler (see Using Fibers in The Fibers manual).

4.2 Utilities

4.2.1 Cache

(use-modules (eris cache))

It might make sense to cache retrieved blocks as the same content may be decoded multiple times or we are pre-fetching blocks (see Lookahead). This module provides an in-memory cache for blocks of ERIS encoded content.

Scheme Procedure: make-eris-block-cache block-ref #:capacity capacity #:workers workers ¶

Create an ERIS block cache using the underlying block reference function block-ref.

The procedure must be called with a running Fibers scheduler.

The capacity of the cache in bytes can be set with the argument capacity. The default capacity is 16 MiB (512 blocks of size 32 KiB).

Block requests are handled by a pool of concurrent workers. The number of workers can be set with the argument workers (defaults to 8).

Scheme Procedure: eris-block-cache? val ¶: Returns #t if val is an ERIS block cache.

Scheme Procedure: eris-block-cache-ref cache ref ¶: Try and get a block with reference ref from the block cache cache. If block is not cached the block will be fetched using the block reference function of the cache.

Scheme Procedure: eris-block-cache-ref-operation cache ref ¶

Returns an operation that, when performed, requests the block with reference ref from the cache and, if it is not cached, from the underlying block reference function.

See Operations in The Fibers manual.

Scheme Procedure: eris-block-cache-stop cache ¶: Stop the block cache.

4.2.2 Lookahead

(use-modules (eris lookahead))

Retrieving a block from a store might only be possible with considerable latency. Decoding performance will then be mostly bound by this latency as every single block requires a round-trip to the store.

We can solve this with a lookahead. The lookahead eagerly decodes ahead in parallel to the main decoder. The lookahead will retrieve blocks from the store so that when the main decoder needs them they are already available.

The lookahead we provide is implemented as a SRFI-171 transducer (see Transducers in GNU Guile Reference Manual) and can be used when decoding content with eris-transduce (see Decoding).

Scheme Procedure: teris-lookahead read-capability #:block-ref block-ref #:workers workers #:lookahead-distance lookahead-distance ¶

Returns a SRFI-171 transducer that can be used with eris-transduce to decode content ahead of the main decoder.

The procedure must be called with a running Fibers scheduler.

The read capability must be explicitly passed with read-capability.

The number of lookahead workers can be set with the argument workers. If not provided it will default to 8.

The lookahead will only decode ahead up to lookahead-distance in front of the main decoder. The default value is (* block-size arity).

For an example see coap/client.scm.

Index

	Index Entry	Section

C
	convergence secret:	Convergent Encryption

R
	read capability:	Overview and Concepts

	Index Entry	Section

%
	`%eris-spec-version`:	Encoding and Decoding Content
	`%null-convergence-secret`:	Encoding

-
	`->eris-read-capability`:	Working with Read Capabilities

E
	`eris-block-cache-ref`:	Cache
	`eris-block-cache-ref-operation`:	Cache
	`eris-block-cache-stop`:	Cache
	`eris-block-cache?`:	Cache
	`eris-coap-block-reducer`:	CoAP
	`eris-coap-block-ref`:	CoAP
	`eris-decode`:	Decoding
	`eris-decoder-init`:	Decoder
	`eris-decoder-length`:	Decoder
	`eris-decoder-position`:	Decoder
	`eris-decoder-read!`:	Decoder
	`eris-decoder-seek`:	Decoder
	`eris-decoder?`:	Decoder
	`eris-encode`:	Encoding
	`eris-encoder-finalize`:	Encoder
	`eris-encoder-init`:	Encoder
	`eris-encoder-result`:	Encoder
	`eris-encoder-update`:	Encoder
	`eris-encoder?`:	Encoder
	`eris-read-capability->bytevector`:	Working with Read Capabilities
	`eris-read-capability->string`:	Working with Read Capabilities
	`eris-read-capability-block-size`:	Working with Read Capabilities
	`eris-read-capability-level`:	Working with Read Capabilities
	`eris-read-capability-root-key`:	Working with Read Capabilities
	`eris-read-capability-root-reference`:	Working with Read Capabilities
	`eris-transduce`:	Decoding

M
	`make-eris-block-cache`:	Cache
	`make-eris-read-capability`:	Working with Read Capabilities

O
	`open-eris-input-port`:	Decoding
	`open-eris-output-port`:	Encoding

S
	`set-eris-encoder-result!`:	Encoder

T
	`teris-lookahead`:	Lookahead
