Interplanetary File System Could Pave the Way for a Distributed, Permanent Web

16 Jun 2021 9:15am, by

For the last couple of decades, many of us have benefited from the massive changes that the Internet has brought into our lives. We are now accustomed to constant access to information, the virtual community building made possible by social media, and the ease and convenience offered by e-commerce websites. All of this is made possible by the Hypertext Transfer Protocol (HTTP), which was first proposed in 1989 to facilitate the sharing of information between researchers at CERN.

But online content has evolved a lot since those early days of Web 1.0, as we can call it now. Back then, webpages were mostly static, with little in terms of user-generated content or interaction. As things matured into Web 2.0, the landscape shifted to emphasize interoperability and a participatory culture, visible in the large amounts of user-generated content found on wikis, blogs, video and image sharing sites, and social media platforms, alongside growing demand for streaming services. But the emergence of this bandwidth-heavy hypermedia, along with the potentially huge influx of data from the Internet of Things, is beginning to strain the internet, prompting some to look for alternatives.

IPFS: “Faster, Safer, and More Open”

One promising candidate for building the next-generation Web 3.0 is the Interplanetary File System (IPFS), a relatively new hypermedia protocol and decentralized data storage system that leverages a peer-to-peer (p2p) network architecture.

IPFS’ sci-fi-worthy moniker is a nod to American computer scientist J.C.R. Licklider’s musings about an “intergalactic computer network” in the 1960s. IPFS was first developed in 2014 by American computer scientist and founder of Protocol Labs Juan Benet, in order to address some of the shortcomings of HTTP. Benet’s aim was to create something that could eventually become a “new major subsystem of the internet,” while also taking into account newer developments like the distributed ledger technology that underpins blockchain.

“IPFS is a decentralized data network,” explained Protocol Labs’ engineering manager Mikeal Rogers. “Anyone in the world can make data available in the network, and anyone in the world can receive that data from them or from anyone else securely. IPFS was, and still is, developed as the data transfer protocol of Web 3.0. Since the protocol is entirely decentralized, and all data is addressed by hash, it’s a perfect fit for blockchain applications that need to work with large amounts of data they can’t put into the chain itself.”

The decentralized model that underlies IPFS stands in stark contrast to the client-server model that HTTP runs on. Originally designed for transferring information between web browsers and web servers, HTTP uses location-based addressing that allows users to access data stored on centralized servers. While this simplifies the management and distribution of data, it’s not very efficient. That’s because when you visit a website, your web browser has to connect directly to the server that is hosting that website. With larger audio and video files, this can use up a lot of bandwidth and even be quite costly, particularly if the origin server is located far away. Many users browsing or downloading the same popular content can also cause network congestion. There are also potential privacy and security problems with HTTP: the data can be accessed or altered by whoever has control of the server, or made inaccessible by distributed denial-of-service (DDoS) attacks.

By comparison, IPFS uses content-based addressing so that content can be verified and decoupled from distant servers, and stored closer to the user. It does that through the use of content identifiers (CIDs), or “labels” which are used to point to material in IPFS. CIDs are derived from the content’s cryptographic hash, which is the output of a function that takes some arbitrary input and returns a fixed-length value.

“When you put data in IPFS, it is available by its hash address or CID,” said Rogers. “Any person in the world can take that address, put it in their computer, and retrieve the data. Just like anyone in the world can put a URL into their browser and retrieve it, anyone with a CID can retrieve data available in the IPFS network.”

Any discrepancies in the content will result in a different CID, and the same content added to different IPFS nodes will still produce the same CID, meaning that users can easily verify the integrity of the data.
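To make the idea concrete, here is a minimal sketch of content addressing in Python. It is an illustration only: a plain SHA-256 hex digest stands in for a CID (real CIDs carry extra encoding and metadata not shown here), and the `ContentStore` class and `content_address` function are hypothetical names invented for this example, not part of any IPFS library.

```python
import hashlib
from typing import Dict


def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a simplified stand-in for a CID."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory node: blocks are keyed by the hash of their content."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is computed from the bytes, not chosen by the node.
        address = content_address(data)
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]


# Two independent nodes store the same bytes and arrive at the same address.
node_a = ContentStore()
node_b = ContentStore()
addr_a = node_a.put(b"hello, distributed web")
addr_b = node_b.put(b"hello, distributed web")

# Changing even one character yields a different address.
addr_changed = node_a.put(b"hello, distributed web!")

print(addr_a == addr_b)        # same content, same address
print(addr_a == addr_changed)  # different content, different address
```

Because the address depends only on the bytes, neither node needs to coordinate with the other to agree on what the content is called.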

“With a CID, the hash of the data you’re looking for is in the address, so you can get that data from anywhere because you can validate the data is correct by comparing the hash,” added Rogers. “This is what allows you to have a decentralized trust-less network, since you can retrieve the data from anywhere and anyone, and you can cryptographically validate the data is correct.”

In addition, IPFS’s content addressing is more efficient than HTTP’s location-based addressing because data can be retrieved from the closest node in the network, rather than from a remote server. This configuration also means that even if one node goes down, the content in its entirety can still be retrieved from other nodes. Data persists, even if part of the network is down, as it may be during large-scale outages or censorship campaigns.
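The combination of hash validation and redundancy described above can be sketched in a few lines of Python. This is a simplified illustration, not how IPFS itself discovers or contacts peers: each "peer" is just a hypothetical function that takes an address and returns bytes, and a SHA-256 hex digest again stands in for a CID.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A peer is modeled as a function: address in, bytes (or None) out.
Peer = Callable[[str], Optional[bytes]]


def content_address(data: bytes) -> str:
    """Simplified stand-in for a CID: the SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(address: str, peers: Iterable[Peer]) -> bytes:
    """Try each peer in turn and return the first response whose hash matches."""
    for peer in peers:
        try:
            data = peer(address)
        except ConnectionError:
            continue  # this peer is down, so move on to the next one
        if data is not None and content_address(data) == address:
            return data  # hash matches: the bytes are what was asked for
    raise LookupError(f"no peer returned valid data for {address}")


original = b"a page worth keeping"
address = content_address(original)


def offline_peer(_address: str) -> Optional[bytes]:
    raise ConnectionError("peer unreachable")


def tampering_peer(_address: str) -> Optional[bytes]:
    return original + b" (edited)"  # altered content fails the hash check


def honest_peer(requested: str) -> Optional[bytes]:
    return original if requested == address else None


result = fetch_verified(address, [offline_peer, tampering_peer, honest_peer])
print(result == original)
```

The requester never has to trust any individual peer: an unreachable node is skipped, a tampered response is rejected by the hash comparison, and any honest copy anywhere in the network is good enough.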

“The major difference with IPFS is that a CID can be retrieved from anyone in the world who wishes to make that data available, whereas with HTTP a website has to live at the specific location encoded in the URL,” Rogers pointed out.

To access or store content on the IPFS network, one has to install the software and run an IPFS node. To access IPFS over HTTP without installing the software, one uses gateways, such as the IPFS public gateway, or the Distributed Web Gateway that is managed by Cloudflare. Any of these gateways will allow users to retrieve content from anyone in the network.
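As a rough sketch of what gateway access looks like in practice, the Python snippet below builds a path-style gateway URL of the form `<gateway>/ipfs/<CID>`. The `gateway_url` helper, the placeholder CID, and the `gateway.example` host are all hypothetical and exist only for illustration; `https://ipfs.io` is used as the default because it is the public gateway mentioned above.

```python
from urllib.parse import quote

# Placeholder only: not a real CID, substitute one you actually want to fetch.
EXAMPLE_CID = "bafyexamplecid"


def gateway_url(cid: str, gateway: str = "https://ipfs.io", path: str = "") -> str:
    """Build a path-style gateway URL: <gateway>/ipfs/<cid>[/<path>]."""
    base = gateway.rstrip("/")
    url = f"{base}/ipfs/{quote(cid, safe='')}"
    if path:
        # Optional path to a file inside a directory addressed by the CID.
        url += "/" + quote(path.lstrip("/"))
    return url


# The same CID can be requested through any gateway; only the host changes.
print(gateway_url(EXAMPLE_CID))
print(gateway_url(EXAMPLE_CID, gateway="https://gateway.example/"))
print(gateway_url(EXAMPLE_CID, path="docs/index.html"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client, for example `urllib.request.urlopen`. Note that only the host portion identifies the gateway, while the CID identifies the content, so switching gateways does not change what is retrieved.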

[Figure: Centralized, decentralized, and distributed network systems]

There’s a DApp for That

Besides providing decentralized content storage and verification, IPFS can be used for building and hosting decentralized apps (DApps), which are open-source computer applications that have their backend code running on peer-to-peer networks. Some DApps have their own blockchain, like Bitcoin, while others run on existing blockchains and generate their own tokens. DApps range from digital asset exchanges and games to social media platforms.

“These are all examples of DApps that leverage off-chain data, and that’s what IPFS is great for,” noted Rogers. “Any time you want to refer to data in a decentralized system, like a DApp, you should use IPFS. You should probably use a gateway for reading that data in the web browser, as p2p protocols are still making their way into browsers. But references to data that you put into transactions in blockchains should always use IPFS addresses, so that you can look up that data from anywhere in any content addressed network, whether that’s IPFS or future networks, since the address doesn’t lock you into IPFS or any specific protocol.”

Another high-profile use case for DApps is the creation, distribution and storage of non-fungible tokens, or NFTs, which are essentially unique cryptographic assets that are stored on a blockchain. Like priceless collectibles in real life and unlike fiat currencies, NFTs derive their value from the fact that they cannot be exchanged or traded at equivalency, and their authenticity and ownership history (or provenance) are easily verifiable thanks to blockchain technology. NFTs range from digital artworks and tweets to collectible characters (like CryptoKitties’ virtual cats). All of these digital collectibles need to be stored somewhere, and that’s where decentralized systems like IPFS come in.

“For NFTs specifically, we’ve gone even further and set up a service to store NFT data indefinitely for free at nft.storage,” added Rogers.

The New Trust-less Web

So could IPFS complement or even replace HTTP in the future? Perhaps. But it’s certain that the evolutionary trajectory of the Web will soon outstrip current protocols, if it hasn’t already. In the meantime, IPFS is still being refined as a growing number of users, developers and companies are adopting it.

“IPFS is general purpose, and has little in the way of storage limitations,” wrote Neocities founder Kyle Drake in a blog post announcing his company’s decision to become the first major website to implement IPFS. “It can serve files that are large or small. It automatically breaks up larger files into smaller chunks, allowing IPFS nodes to download (or stream) files from not just one server like with HTTP, but hundreds of them simultaneously. The IPFS network becomes a finely-grained, trust-less, distributed, easily federated content delivery network (CDN). This is useful for pretty much everything involving data: images, video streaming, distributed databases, entire operating systems, blockchains, backups of 8-inch floppy disks, and most important for us, static web sites.”
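The chunking that Drake describes can be illustrated with a small Python sketch. It is a deliberately simplified model and not the actual IPFS data format: the file is split into fixed-size chunks, each chunk is stored under its SHA-256 digest, and a "manifest" listing the chunk digests is stored under its own digest, which serves as the root address. The tiny chunk size, the manifest layout, and all function names are assumptions made for this example.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # unrealistically small, so the demo produces several chunks


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def add_file(data: bytes, store: Dict[str, bytes], chunk_size: int = CHUNK_SIZE) -> str:
    """Split data into chunks, store each by hash, and return the root address."""
    chunk_hashes: List[str] = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = sha256_hex(chunk)
        store[digest] = chunk
        chunk_hashes.append(digest)
    # The manifest is itself content-addressed, so one hash covers the whole file.
    manifest = "\n".join(chunk_hashes).encode("ascii")
    root = sha256_hex(manifest)
    store[root] = manifest
    return root


def find_block(digest: str, nodes: List[Dict[str, bytes]]) -> bytes:
    """Return the block from whichever node has a copy that matches its hash."""
    for node in nodes:
        block = node.get(digest)
        if block is not None and sha256_hex(block) == digest:
            return block
    raise LookupError(f"block {digest} not found on any node")


def read_file(root: str, nodes: List[Dict[str, bytes]]) -> bytes:
    """Fetch the manifest, then fetch and verify every chunk it lists."""
    manifest = find_block(root, nodes)
    chunk_hashes = manifest.decode("ascii").splitlines()
    return b"".join(find_block(digest, nodes) for digest in chunk_hashes)


data = b"static web sites, served in pieces"
origin: Dict[str, bytes] = {}
root = add_file(data, origin)

# Scatter the blocks across two nodes so that neither holds the whole file.
node_a: Dict[str, bytes] = {}
node_b: Dict[str, bytes] = {}
for index, (digest, block) in enumerate(sorted(origin.items())):
    (node_a if index % 2 == 0 else node_b)[digest] = block

rebuilt = read_file(root, [node_a, node_b])
print(rebuilt == data)
```

Since every chunk is verified against its own hash, the pieces can come from different nodes, in any order, and the reader still ends up with exactly the bytes the root address commits to.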

Ultimately, these core notions behind IPFS may offer one way to build a distributed, permanent web. It’s one possible alternative to the brittle and hypercentralized system that we’ve now arrived at with outdated protocols like HTTP, and potentially a useful hedge against an uncertain future.

“Part of our mission at Protocol Labs is building technology for the benefit of humanity in the long term,” said Rogers. “A large portion of human culture already takes place online — but in closed platforms like Instagram. Data primitives for decentralized media sharing — such as NFTs — could eventually replace closed platforms like Instagram. We think this data needs to be persisted indefinitely: it’s part of our recorded history as a species, and we’re happy to be in a position to provide long-term guaranteed persistence of human culture.”

More over at ipfs.io.

Images: IPFS & Pexels
