An Introduction To IPFS - ConsenSys - Medium
A blockchain venture production studio building decentralized applications on Ethereum. Go to
www.consensys.net and subscribe to our newsletter.
Feb 17, 2016 · 10 min read
An Introduction to IPFS
credit: Bogdan Burcea
“When you have IPFS, you can start looking at everything else in one
specific way and you realize that you can replace it all” (Juan Benet)
A Less Technical Approach to IPFS
from John Lilic
IPFS began as an effort by Juan Benet to build a system that is very fast
at moving around versioned scientific data. Versioning gives you the
ability to track how states of software change over time (think Git).
IPFS has since come to be thought of as the Distributed, Permanent
Web: “IPFS is a distributed file system that seeks to connect all computing
devices with the same system of files. In some ways, this is similar to the
original aims of the Web, but IPFS is actually more similar to a single
bittorrent swarm exchanging git objects. IPFS could become a new
major subsystem of the internet. If built right, it could complement or
replace HTTP. It could complement or replace even more. It sounds
crazy. It is crazy.”[1]
At its core, IPFS is a versioned file system that can take files, manage
them, store them somewhere, and track their versions over time. IPFS
also accounts for how those files move across the network, so it is
also a distributed file system.
IPFS has rules as to how data and content move around on the network
that are similar in nature to BitTorrent. This file system layer offers very
interesting properties such as:
Content Addressing
IPFS uses content addressing at the HTTP layer. Instead of creating an
identifier that addresses things by location, we address content by
some representation of the content itself, so the content determines
the address. The mechanism is to take a file and hash it
cryptographically, which yields a very small and secure representation
of the file and ensures that someone cannot simply come up with another
file that has the same hash and use that as the address. The address of
a file in IPFS usually starts with a hash that identifies some root
object, followed by a path walking down. Instead of talking to a server,
you are talking to a specific object and then looking at a path within
that object.
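To make the idea concrete, here is a minimal sketch of a content-addressed store in Python. It assumes plain SHA-256 hex digests as addresses; real IPFS hashes a serialized object and encodes the result differently, so the addresses below will not match anything on the IPFS network. The names `content_address`, `put`, and `get` are made up for this illustration.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in address."""
    return hashlib.sha256(data).hexdigest()

# Toy store: the key is derived from the value itself, not from a location.
store = {}

def put(data: bytes) -> str:
    addr = content_address(data)
    store[addr] = data
    return addr

def get(addr: str) -> bytes:
    data = store[addr]
    # The reader can check that what came back matches what was asked for.
    if content_address(data) != addr:
        raise ValueError("content does not match its address")
    return data

a = put(b"Hello World!\n")
b = put(b"Hello World!\n")  # same content, so the same address
c = put(b"Testing 123\n")   # different content, so a different address
```

Because the address is computed from the bytes, storing the same content twice yields one entry, and any change to the content yields a new address.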
2. Go and find it: when you have the hash, you ask the network
you’re connected to ‘who has this content? (hash)’, then you
connect to the corresponding nodes and download it.
The result is a peer-to-peer overlay that gives you very fast routing.
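The "who has this content?" question can be pictured as a lookup from a hash to the peers that hold it. The sketch below assumes a single in-memory table and invented peer names; in a real network this index is spread across the peers themselves rather than held in one place. `announce` and `who_has` are hypothetical helper names.

```python
from collections import defaultdict

# content hash -> set of peer ids that have announced they hold it
providers = defaultdict(set)

def announce(peer_id: str, content_hash: str) -> None:
    """Record that a peer can serve the content with this hash."""
    providers[content_hash].add(peer_id)

def who_has(content_hash: str) -> list:
    """Return the peers known to hold the content, in sorted order."""
    return sorted(providers.get(content_hash, ()))

announce("peer-B", "hash-of-hello")
announce("peer-A", "hash-of-hello")
announce("peer-A", "hash-of-testing")

peers = who_has("hash-of-hello")
```

A client would then connect to any of the returned peers, download the bytes, and re-hash them to confirm they match the hash it asked for.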
IPFS by Example
from Dr. Christian Lundkvist
IPFS Objects
IPFS is essentially a P2P system for retrieving and sharing IPFS objects.
An IPFS object is a data structure with two fields:
• Data — a blob of unstructured binary data of size < 256 kB.
The Size field is mainly used for optimizing the P2P networking and
we’re going to mostly ignore it here, since conceptually it’s not needed
for the logical structure.
IPFS objects are normally referred to by their Base58 encoded hash. For
instance, let’s take a look at the IPFS object with hash
QmarHSr9aSNaPSR6G9KFPbuLV9aEqJfTk1y9B8pdwqK4Rq using the
IPFS command-line tool (please try this at home!):
{"Links": [{
"Name": "AnotherName",
"Hash": "QmVtYjNij3KeyGmcgg7yVXWskLaBtov3UYL9pgcGK3MCWu",
"Size": 18},
{"Name": "SomeName",
"Hash": "QmbUSy8HCn8J4TMDRRdxCbK2uCCtkQyZtY6XYv3y7kLgDC",
"Size": 58}],
The data and named links give the collection of IPFS objects the
structure of a Merkle DAG — DAG meaning Directed Acyclic Graph, and
Merkle to signify that this is a cryptographically authenticated data
structure that uses cryptographic hashes to address content. It is left as
an exercise to the reader to think about why it’s impossible to have
cycles in this graph.
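As a rough illustration of the Merkle property, the sketch below hashes an object from its data together with the hashes of the objects it links to. The JSON serialization, SHA-256 hex digests, and the function name `object_hash` are assumptions made for this example and are not the IPFS wire format.

```python
import hashlib
import json

def object_hash(data: bytes, links: list) -> str:
    """Hash an object from its data and its (name, child_hash) links.

    Links are sorted so the hash does not depend on insertion order
    (a choice made for this sketch).
    """
    payload = json.dumps({"data": data.hex(), "links": sorted(links)})
    return hashlib.sha256(payload.encode()).hexdigest()

leaf_a = object_hash(b"Hello World!\n", [])
leaf_b = object_hash(b"Testing 123\n", [])
parent = object_hash(b"", [("SomeName", leaf_a), ("AnotherName", leaf_b)])

# Changing a leaf changes its hash, which changes every ancestor's hash.
changed_leaf = object_hash(b"Testing 124\n", [])
changed_parent = object_hash(
    b"", [("SomeName", leaf_a), ("AnotherName", changed_leaf)]
)
```

Notice that a link can only be written down once the target's hash is already known, which is a useful hint for the exercise about cycles.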
File systems
IPFS can easily represent a file system consisting of files and directories.
Small Files
A small file (< 256 kB) is represented by an IPFS object with data being
the file contents (plus a small header and footer) and no links, i.e. the
links array is empty. Note that the file name is not part of the IPFS
object, so two files with different names and the same content will have
the same IPFS object representation and hence the same hash.
added QmfM2r8seH2GiRaC4esTjeraXEachRt8ZsSeGaWTPLyMoG
test_dir/hello.txt
We can view the file contents of the above IPFS object using ipfs cat:
{"Links": [],
added QmR45FmbVVrixReBwJkhEKde2qwHYaQzGxu4ZoDeswuF9w
test_dir/bigfile.js
{"Links": [{
"Name": "",
"Hash": "QmYSK2JyM3RyDyB52caZCTKFR3HKniEcMnNJYdk8DQ6KKB",
"Size": 262158},
{"Name": "",
"Hash": "QmQeUqdjFmaxuJewStqCLUoKrR9khqb4Edw9TfRQQdfWz3",
"Size": 262158},
{"Name": "",
"Hash": "Qma98bk1hjiRZDTmYmfiUXDj8hXXt7uGA5roU5mfUb3sVG",
"Size": 178947}],
Directory Structures
chris@chris-VBox:~/tmp$ ls -R test_dir
test_dir:
bigfile.js hello.txt my_dir
test_dir/my_dir:
my_file.txt testing.txt
The files hello.txt and my_file.txt both contain the string Hello World!\n.
The file testing.txt contains the string Testing 123\n.
The IPFS command-line tool can seamlessly follow the directory link
names to traverse the file system:
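As a rough illustration of that traversal, the Python sketch below rebuilds part of the test_dir tree as linked objects and walks a path by following link names. It is a toy model: the in-memory `objects` table, the JSON-plus-SHA-256 hashing, and the helper names `put` and `resolve` are assumptions for this example, and bigfile.js is left out for brevity.

```python
import hashlib
import json

objects = {}  # hash -> {"data": bytes, "links": {name: hash}}

def put(data: bytes, links: dict) -> str:
    """Store an object under the hash of its data and links."""
    payload = json.dumps({"data": data.hex(), "links": links}, sort_keys=True)
    h = hashlib.sha256(payload.encode()).hexdigest()
    objects[h] = {"data": data, "links": links}
    return h

def resolve(root: str, path: str) -> dict:
    """Start at the root object and follow one named link per path segment."""
    node = objects[root]
    for name in filter(None, path.split("/")):
        node = objects[node["links"][name]]
    return node

hello = put(b"Hello World!\n", {})
testing = put(b"Testing 123\n", {})
# my_file.txt has the same content as hello.txt, so it links to the same object.
my_dir = put(b"", {"my_file.txt": hello, "testing.txt": testing})
test_dir = put(b"", {"hello.txt": hello, "my_dir": my_dir})

print(resolve(test_dir, "my_dir/testing.txt")["data"])
```

File names live only in the directory objects' links, which is why two names can point at a single stored copy of the content.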
IPFS can represent the data structures used by Git to allow for
versioned file systems. The Git commit objects are described in the Git
Book. The structure of the IPFS Commit object is not fully specified at
the time of this writing; the discussion is ongoing.
The main property of the Commit object is that it has one or more
links with names parent0, parent1, etc. pointing to previous commits,
and one link with name object (this is called tree in Git) that points to
the file system structure referenced by that commit.
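Since the Commit object format was not finalized, the following is only a sketch of the link layout described above: an object link to a file system tree plus parent0, parent1, ... links to earlier commits. Storing the commit message in the data field, allowing a first commit with no parents, and the helper names `put`, `commit`, and `history` are all assumptions made for this example.

```python
import hashlib
import json

objects = {}  # hash -> {"data": bytes, "links": {name: hash}}

def put(data: bytes, links: dict) -> str:
    """Store an object under the hash of its data and links."""
    payload = json.dumps({"data": data.hex(), "links": links}, sort_keys=True)
    h = hashlib.sha256(payload.encode()).hexdigest()
    objects[h] = {"data": data, "links": links}
    return h

def commit(tree: str, parents: list, message: str) -> str:
    """Create a commit linking to a tree ("object") and its parent commits."""
    links = {"object": tree}
    for i, parent in enumerate(parents):
        links[f"parent{i}"] = parent
    return put(message.encode(), links)

def history(head: str) -> list:
    """Follow parent0 links from the head back to the first commit."""
    messages = []
    current = head
    while current is not None:
        messages.append(objects[current]["data"].decode())
        current = objects[current]["links"].get("parent0")
    return messages

tree_v1 = put(b"", {"hello.txt": put(b"Hello World!\n", {})})
tree_v2 = put(b"", {"hello.txt": put(b"Hello IPFS!\n", {})})
c1 = commit(tree_v1, [], "first commit")
c2 = commit(tree_v2, [c1], "update hello.txt")
```

Each commit hash covers its tree and its parents, so the hash of the latest commit pins down the entire history behind it.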
Blockchains
This is one of the most exciting use cases for IPFS. A blockchain has a
natural DAG structure in that past blocks are always linked by their
hash from later ones. More advanced blockchains like the Ethereum
blockchain also have an associated state database with a Merkle-Patricia
tree structure that can also be emulated using IPFS objects.
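The hash-linking of blocks can be sketched in a few lines of Python. The field names `prev` and `payload`, the JSON serialization, and the SHA-256 hex digests are hypothetical choices for this illustration and do not reflect any particular blockchain's format.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block from a deterministic serialization of its fields."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def make_block(prev_hash, payload: str) -> dict:
    """Create a block that links to the previous block by its hash."""
    return {"prev": prev_hash, "payload": payload}

def verify_chain(blocks: list) -> bool:
    """Check that every block's 'prev' matches the hash of its predecessor."""
    for earlier, later in zip(blocks, blocks[1:]):
        if later["prev"] != block_hash(earlier):
            return False
    return True

genesis = make_block(None, "genesis")
b1 = make_block(block_hash(genesis), "tx batch 1")
b2 = make_block(block_hash(b1), "tx batch 2")
chain = [genesis, b1, b2]

ok_before = verify_chain(chain)
genesis["payload"] = "tampered"  # rewrite history in the first block
ok_after = verify_chain(chain)
```

After the tampering, the first block's hash no longer matches the link stored in the second block, so the check fails. This is the same authentication property the Merkle DAG gives IPFS objects.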
We assume a simplistic model of a blockchain where each block
contains the following data: