How DecentraTech Can Reduce Costs for Artificial Intelligence – Part 2: Decentralized Storage
11/25/2024
By Pete Harris, Co-Founder and Executive Director, DecentraTech Collective

In the last blog, I discussed the considerable compute needs of artificial intelligence (AI) applications and how Decentralized Physical Infrastructure Networks (DePIN) can offer a solution. As it happens, DePIN can also help address another infrastructure bottleneck that often limits AI: access to and storage of vast amounts of trustworthy data.

For generative AI models to produce accurate results, which is essential for meeting responsible and trustworthy AI requirements, they need to be trained. Training teaches a model how to perform a particular task so that it can identify patterns in the data presented to it at run time. This training typically requires high-quality data, and lots of it.

And storage of that data can be challenging. Even though physical storage costs are continually falling, building storage infrastructure that is continuously available, high performance and secure is a significant investment. Moreover, given that AI models will likely need access to increasing quantities of data as they are developed, ongoing costs will almost certainly increase over time.

Adopting a DePIN approach to AI data storage has the potential to accelerate scale-up while reducing costs. As with AI compute, DePIN would tap into infrastructure provided by many entities, including startups, small/medium enterprises, communities, and individuals, which would participate in the storage pool in return for cryptocurrency-based rewards.
In addition to reduced build-out time and costs, the decentralized architecture of DePIN can offer other benefits compared to traditional centralized data centers, including tamper and censorship resistance (by storing hashes of data to determine whether it has been modified), improved resilience (by replicating data blocks across network nodes), and increased performance. Reduced power consumption is another likely benefit.

Compared to other #DecentraTech projects, including DePIN compute farms, DePIN data storage is already well developed and established, with several open-source and commercial offerings available and significant production use cases to learn from. Examples include Arweave, Codex, DeNet, Filecoin, and Storj.

A number of decentralized storage offerings are based on a set of open-source protocols, built around a distributed hash table, known as the InterPlanetary File System (IPFS). The project was started in 2014 by Juan Benet and his company Protocol Labs, and the first implementation of the IPFS protocols generally considered usable was released as Kubo in April 2016. Both IPFS and Kubo have since been updated, and Kubo is cited by the project as the most popular implementation of IPFS in use today.

Another IPFS-based offering that has been widely adopted for real-world applications by businesses, the scientific community, activist groups, and socially oriented nonprofits is Filecoin. Launched in 2017 and now governed by the Filecoin Foundation, Filecoin adds a “Proof of Storage” incentive function to the core IPFS protocols to reward those providing the physical storage to the network.

As of August 2024, the total capacity of the Filecoin network was 23 exbibytes across some 40 countries, with 2 exbibytes of user data being stored by around 2,000 entities (one exbibyte = 1,152,921,504,606,846,976 bytes, roughly equivalent to 1 quadrillion pages of plain text).
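To put the storage units quoted above in perspective, here is a minimal Python sketch of the arithmetic. The capacity and usage numbers are the ones cited in this post; the `utilization` helper is a hypothetical name introduced only for illustration.

```python
# Binary (IEC) storage units: each step up is a factor of 1,024.
TEBIBYTE = 2**40
PEBIBYTE = 2**50
EXBIBYTE = 2**60

# Figures quoted in this post for the Filecoin network as of August 2024.
total_capacity_eib = 23
stored_user_data_eib = 2


def utilization(stored: float, capacity: float) -> float:
    """Fraction of raw capacity that currently holds user data."""
    return stored / capacity


print(f"1 EiB = {EXBIBYTE:,} bytes")
print(f"1 EiB = {EXBIBYTE // PEBIBYTE:,} PiB = {EXBIBYTE // TEBIBYTE:,} TiB")
print(f"Utilization: {utilization(stored_user_data_eib, total_capacity_eib):.1%}")
```

On these figures, roughly 9% of the network's raw capacity holds user data, which suggests there is considerable headroom for new workloads such as AI training sets.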
Some of the higher-profile users of Filecoin include NASA, the US Geological Survey and the National Institutes of Health. More than 500 entities have datasets of more than 1,000 tebibytes, while the Internet Archive’s Democracy Library stores more than a pebibyte of open government data.

Proponents of Filecoin cite its ability to store vast quantities of data and prove that the data has been stored securely and not altered, together with a decentralized architecture that makes it resistant to tampering, as ideal attributes for responsible AI workloads, where an audit trail of data inputs and models provides provenance and promotes trust in the outputs of models.

Recognizing that it offers such auditability and security benefits, Filecoin has recently announced several partnerships related to expanding its use for AI applications. For example, SingularityNET is focusing on securing metadata for verifiable model training, while Eternal AI and EQTY Lab are tapping Filecoin to validate model lineage.
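The idea of detecting tampering by comparing stored hashes can be illustrated with a minimal Python sketch. This is not how Filecoin or IPFS implement their proofs; it only shows the underlying principle using SHA-256 from the standard library. The file names, contents, and helper functions are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a blob as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(datasets: dict[str, bytes]) -> dict[str, str]:
    """Record a digest for every named training input."""
    return {name: fingerprint(blob) for name, blob in datasets.items()}


def find_tampered(datasets: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """List the inputs whose current digest no longer matches the manifest."""
    return sorted(
        name
        for name, blob in datasets.items()
        if manifest.get(name) != fingerprint(blob)
    )


# Hypothetical training inputs, fingerprinted before training starts.
training_data = {
    "reviews.csv": b"id,text\n1,great product\n",
    "labels.csv": b"id,label\n1,positive\n",
}
manifest = build_manifest(training_data)

# Later, one file is silently modified.
training_data["labels.csv"] = b"id,label\n1,negative\n"
print(find_tampered(training_data, manifest))  # ['labels.csv']
```

A manifest like this, kept somewhere that is itself hard to alter, is the kind of audit trail that lets a third party check which data a model was actually trained on.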
How DecentraTech Can Reduce Costs for Artificial Intelligence – Part 1: Decentralized Compute
11/4/2024
By Pete Harris, Co-Founder and Executive Director, DecentraTech Collective

It’s no secret that the popularity of artificial intelligence (AI) has exploded for both business and consumer applications. Along with that trend, public cloud platforms from Amazon Web Services, Microsoft, and Google have been upgraded with the latest GPU chips to power AI apps.
While compute is not the only enabler of success, the winners in the world of AI will likely be the companies that can amass substantial compute power to underpin their services.

But for now there are a couple of issues with that endeavor. Firstly, the demand for the most powerful GPU chips is outstripping supply. And those chips are pricey. For example, the most powerful H100 chips from GPU leader Nvidia sell for around $40,000, while renting time on a GPU at a cloud provider costs about 100 bucks an hour.

No wonder then that some AI startups are raising billions of dollars in funding simply to pay for the compute infrastructure that they will need to operate. One example is CoreWeave, a specialist AI cloud provider, which recently closed on $7.5 billion in debt financing to double its datacenter capacity.

Meanwhile, leading crypto VC firm Andreessen Horowitz is building GPU server clusters for companies to use. It expects to host some 20,000 GPUs as part of this initiative, known as Oxygen, which it hopes to use as a competitive tool to lure startups to its portfolio. Other companies, including Elon Musk's xAI and Meta, are rolling out AI clusters with 100,000 GPUs.

Clearly, not many startups (or even enterprises for that matter) have the funding to be able to throw masses of GPUs at their AI endeavors. This is where DecentraTech approaches, specifically those leveraging Decentralized Physical Infrastructure Networks (DePIN), might present a path forward.

Unlike traditional, centralized datacenters, which are typically built by single companies such as Microsoft (which, by the way, expects to spend $50 billion on new datacenters for AI), DePINs leverage blockchain technology to decentralize control, ownership and the cost of building and maintaining physical infrastructure.
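A quick back-of-the-envelope calculation shows why these sums add up so fast. The Python sketch below uses the approximate prices quoted in this post and assumes, purely for illustration, that every GPU in a cluster is an H100 bought at that price. It ignores power, networking, cooling, and staffing, so real costs would be higher. The function names are hypothetical.

```python
H100_PRICE_USD = 40_000    # approximate purchase price quoted in this post
RENTAL_USD_PER_HOUR = 100  # approximate cloud rental rate quoted in this post


def cluster_purchase_cost(gpu_count: int, unit_price: int = H100_PRICE_USD) -> int:
    """Cost of the GPUs alone, assuming every GPU is bought at unit_price."""
    return gpu_count * unit_price


def break_even_hours(
    unit_price: int = H100_PRICE_USD, hourly_rate: int = RENTAL_USD_PER_HOUR
) -> float:
    """Rental hours after which renting has cost as much as buying."""
    return unit_price / hourly_rate


for label, gpus in [("20,000-GPU cluster", 20_000), ("100,000-GPU cluster", 100_000)]:
    cost = cluster_purchase_cost(gpus)
    print(f"{label}: ${cost / 1e9:.1f} billion in GPUs alone")

print(f"Renting matches the purchase price after {break_even_hours():,.0f} hours")
```

Under these assumptions a 100,000-GPU cluster represents about $4 billion in chips alone, and renting at the quoted rate overtakes the purchase price after roughly 400 hours, or a little under 17 days of continuous use.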
In a DePIN model, this infrastructure is provided by large numbers of entities, including startups, small/medium enterprises, communities and individuals, which make it available to the DePIN operator (often part of the time, or as a background task when it is underutilized for its primary use). For compute services, this generally requires the providers to install or run a software process on their workstation or server to register and manage their participation. In return, the providers are rewarded, generally in an operator-specific cryptocurrency.

Note: for more on DePINs in general, see this blog from Multicoin Capital.

Assuming enough providers can be appropriately incentivized to offer their infrastructure to the operator, and the DePIN design offers easy integration for providers and standards-based access for end-user applications, the DePIN approach can provide massive quantities of compute power at a fraction of the cost of traditional cloud services.

While DePIN compute offerings began by offering generic CPU power, the rise of AI applications has led a number of DePIN operators to offer GPUs as well, while others specialize in GPU compute specifically for AI. See below for some examples of AI-oriented DePIN operators.

Akash – established in 2015 and with plenty of experience in decentralized compute, last year it began to roll out various flavors of GPU. Its AKT token is built on Cosmos, a blockchain ecosystem with a mission to establish the 'Internet of Blockchains' by enabling secure communication and interoperability among various blockchains.

Influx Technologies (Flux) – a decentralized infrastructure provider that began life in 2020. Its recent FluxEdge offering provides access to a range of GPUs and is targeted at AI applications.

GAIMIN – positioned as a gaming network, it rewards users who play games in return for tapping into the GPUs in their PCs to power applications, including AI.

IO Research – a decentralized GPU network aligned to the Solana blockchain, originally focused on financial trading but since refocused on the AI space.

The Render Network – focused on rendering of 3D graphics for the entertainment industry, Render allows participants in its GPU network to make unused compute available for AI applications.

Decentralized compute is not the only way that DePINs can reduce costs for AI. As well as compute power, AI models generally need access to large volumes of data for training. So DePIN for decentralized storage of that data is potentially of interest too. That technology will be covered in Part 2 of this blog.
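The provider-and-reward model described in this post can be sketched in a few lines of Python. This is an illustration only: it assumes a simple pro-rata split of a token pool by contributed GPU-hours, which is not the scheme of any particular operator named here, each of which defines its own rules. The `Provider` class, `distribute_rewards` function, and provider names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    gpu_hours: float  # compute contributed during the reward period


def distribute_rewards(providers: list[Provider], reward_pool: float) -> dict[str, float]:
    """Split a token reward pool pro rata by contributed GPU-hours."""
    total = sum(p.gpu_hours for p in providers)
    if total == 0:
        # Nobody contributed, so nobody is paid.
        return {p.name: 0.0 for p in providers}
    return {p.name: reward_pool * p.gpu_hours / total for p in providers}


# Hypothetical participants of different sizes.
providers = [
    Provider("startup-rack", 600.0),
    Provider("community-lab", 300.0),
    Provider("home-workstation", 100.0),
]

payouts = distribute_rewards(providers, reward_pool=1_000.0)
for name, tokens in payouts.items():
    print(f"{name}: {tokens:.1f} tokens")
```

Even a simple rule like this captures the core incentive: the more otherwise idle capacity a provider makes available, the larger its share of the rewards.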
About the Curator
Pete Harris is the Co-Founder and Executive Director of the DecentraTech Collective. He is also Principal of Lighthouse Partners, which provides business strategy services to developers of transformational technologies. He has 40+ years of business and technology experience, focusing in recent years on business applications of blockchain and Web3 technologies.

Curated and Focused Content
The Collective recognizes that there is a wealth of information available from publications, newsletters, blogs, and more. So we are curating 'must read' content in specific areas to help our community to cut through the noise and focus only on what’s important.