How DecentraTech Can Reduce Costs for Artificial Intelligence – Part 2: Decentralized Storage

11/25/2024

By Pete Harris, Co-Founder and Executive Director, DecentraTech Collective

In the last blog, I discussed the considerable compute needs of artificial intelligence (AI) applications and how Decentralized Physical Infrastructure Networks (DePIN) can offer a solution. As it happens, DePIN can also help address another infrastructure bottleneck that often limits AI: access to and storage of vast amounts of trustworthy data.

For generative AI models to produce accurate results (an essential capability for meeting responsible and trustworthy AI requirements), they need to be trained. Training teaches a model how to perform a particular task so that it can identify patterns in the data presented to it when it runs in real time. This training typically requires high-quality data, and lots of it.

Storing that data can be challenging. Even though physical storage costs are continually falling, building storage infrastructure that is continuously available, high performance, and secure is a significant investment. Moreover, given that AI models will likely need access to increasing quantities of data as they are developed, ongoing costs will almost certainly increase over time.

Adopting a DePIN approach to AI data storage capacity has the potential to accelerate scale-up while reducing costs. As with AI compute, DePIN would tap into infrastructure provided by many entities, including startups, small and medium enterprises, communities, and individuals, which would participate in the storage pool in return for cryptocurrency-based rewards.
In addition to reduced build-out time and costs, the decentralized architecture of DePIN can offer other benefits compared to traditional centralized data centers, including tamper and censorship resistance (by storing hashes of data to determine whether it has been modified), improved resilience (by replicating data blocks across network nodes), and increased performance. Reduced power consumption is another likely benefit.

Compared to other #DecentraTech projects, including DePIN compute farms, DePIN data storage is already well developed and established, with several open-source and commercial offerings available and significant production use cases to learn from. Examples include Arweave, Codex, DeNet, Filecoin, and Storj.

A number of decentralized storage offerings are based on a set of open-source Distributed Hash Table protocols known as the InterPlanetary File System (IPFS). The project was started in 2014 by Juan Benet and his company Protocol Labs, and the first implementation of the IPFS protocols that is generally considered usable was released as Kubo in April 2016. Both IPFS and Kubo have since been updated, and the project cites Kubo as the most popular implementation of IPFS in use today.

Another implementation of IPFS that has been widely adopted for real-world applications by businesses, the scientific community, activist groups, and socially oriented nonprofits is Filecoin. Launched in 2017 and now governed by the Filecoin Foundation, Filecoin adds a “Proof of Storage” incentive function to the core IPFS protocols to reward those providing physical storage to the network.

As of August 2024, the total capacity of the Filecoin network was 23 exbibytes across some 40 countries, with 2 exbibytes of user data being stored by around 2,000 entities (one exbibyte = 1,152,921,504,606,846,976 bytes, roughly equivalent to 1 quadrillion pages of plain text).
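To make the hash-based tamper detection and block replication mentioned earlier a little more concrete, here is a minimal Python sketch. It is a toy in-memory model, not the IPFS or Filecoin API: the names `content_id` and `ToyStorageNetwork`, the use of a plain SHA-256 digest as the identifier, and the way replicas are placed on consecutive nodes are all assumptions made for illustration.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in content identifier."""
    return hashlib.sha256(data).hexdigest()


class ToyStorageNetwork:
    """Hypothetical in-memory network: each node is a dict of id -> block."""

    def __init__(self, node_count: int, replicas: int) -> None:
        if not 1 <= replicas <= node_count:
            raise ValueError("replicas must be between 1 and node_count")
        self.nodes = [{} for _ in range(node_count)]
        self.replicas = replicas

    def put(self, data: bytes) -> str:
        """Store the block on `replicas` consecutive nodes and return its id."""
        cid = content_id(data)
        # Assumed placement rule: derive a starting node from the hash itself.
        start = int(cid, 16) % len(self.nodes)
        for offset in range(self.replicas):
            self.nodes[(start + offset) % len(self.nodes)][cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        """Return the first replica whose hash still matches the requested id."""
        for node in self.nodes:
            data = node.get(cid)
            # A modified replica no longer hashes to its id, so it is skipped.
            if data is not None and content_id(data) == cid:
                return data
        raise KeyError(f"no intact replica found for {cid}")


if __name__ == "__main__":
    network = ToyStorageNetwork(node_count=5, replicas=3)
    block_id = network.put(b"training sample")
    # Simulate tampering with one replica; retrieval falls back to an intact copy.
    holder = next(node for node in network.nodes if block_id in node)
    holder[block_id] = b"tampered"
    print(network.get(block_id))
```

Because the identifier is derived from the data itself, any change to a stored block is detectable by rehashing it, and keeping several replicas means one damaged or missing copy does not make the data unavailable.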
Some of the higher-profile users of Filecoin include NASA, the US Geological Survey, and the National Institutes of Health. More than 500 entities have datasets of more than 1,000 tebibytes, while the Internet Archive’s Democracy Library stores more than a pebibyte of open government data.

Proponents of Filecoin cite its ability to store vast quantities of data, its proof that data has been stored securely and not altered, and its tamper-resistant decentralized architecture as ideal attributes for responsible AI workloads, where an audit trail of data inputs and models provides provenance and promotes trust in model outputs.

Recognizing these auditability and security benefits, Filecoin has recently announced several partnerships related to expanding its use for AI applications. For example, SingularityNET is focusing on securing metadata for verifiable model training, while Eternal AI and EQTY Lab are tapping Filecoin to validate model lineage.
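As a rough illustration of what an audit trail of data inputs and models might look like, the Python sketch below chains lineage records together by hash, so that altering any earlier record breaks verification. It is an assumption-laden toy: the record fields and the helper names `append_record` and `verify_log` are hypothetical, and it does not represent how Filecoin, SingularityNET, Eternal AI, or EQTY Lab actually implement provenance.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for "no previous record"


def _digest(body: dict) -> str:
    """Hash a record body deterministically (keys sorted before encoding)."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_record(log: list, event: str, artifact: bytes) -> dict:
    """Append a lineage record that commits to the artifact and the prior record."""
    body = {
        "event": event,
        "artifact_hash": hashlib.sha256(artifact).hexdigest(),
        "prev_hash": log[-1]["record_hash"] if log else GENESIS_HASH,
    }
    record = {**body, "record_hash": _digest(body)}
    log.append(record)
    return record


def verify_log(log: list) -> bool:
    """Recompute every hash and check that each record links to the one before."""
    prev_hash = GENESIS_HASH
    for record in log:
        body = {key: record[key] for key in ("event", "artifact_hash", "prev_hash")}
        if record["prev_hash"] != prev_hash or record["record_hash"] != _digest(body):
            return False
        prev_hash = record["record_hash"]
    return True


if __name__ == "__main__":
    lineage = []
    append_record(lineage, "dataset-ingested", b"example training rows")
    append_record(lineage, "model-trained", b"example model weights")
    print(verify_log(lineage))
```

In a decentralized setting, the records (or just their hashes) would be stored on the network rather than in a local list, so that no single party could quietly rewrite the history.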
About the Curator

Pete Harris is the Co-Founder and Executive Director of the DecentraTech Collective. He is also Principal of Lighthouse Partners, which provides business strategy services to developers of transformational technologies. He has 40+ years of business and technology experience, focusing in recent years on business applications of blockchain and Web3 technologies.

Curated and Focused Content

The Collective recognizes that there is a wealth of information available from publications, newsletters, blogs, and more. So we are curating 'must read' content in specific areas to help our community cut through the noise and focus only on what’s important.