Hello fellow developers and DApp enthusiasts,
I’m currently developing a decentralized application (DApp) that needs to manage very large files, often exceeding 2GB, on the client side within a web environment. I’ve encountered a significant challenge: most browsers have a limitation on handling lists or data structures that exceed 2GB in size.
This limitation poses a problem when generating Content Identifiers (CIDs) for these large files. Ideally, a CID should represent the entire file as a single entity, but the browser’s limitation necessitates processing the data in smaller chunks (each less than 2GB).
Here’s my concern: If I process the file in segments to overcome the browser’s limitation, I’m worried that the resulting CIDs for these segments won’t match the CID that would be generated if the file were processed as a whole. This discrepancy could potentially impact the file’s integration and recognition within the IPFS network.
Has anyone else encountered this issue? Are there strategies or workarounds for generating a consistent CID for very large files without splitting them into smaller chunks? I’m looking for solutions or insights that would allow the DApp to handle these large files efficiently while maintaining consistency in the CIDs generated.
Appreciate any advice or shared experiences!
IPFS already does that: it splits the data into chunks, and the final CID is the hash of the root of the resulting tree.
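The maximum size for a block in Bitswap is 2MiB (that is the minimum "maximum block size" an implementation has to accept to be compliant with Bitswap 1.2.0, see https://github.com/ipfs/specs/pull/269), so every block of a large file stays far below the 2GB browser limit.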
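To make that concrete, here is a rough sketch of why reading the file in pieces does not change the result. It is not the real IPFS/UnixFS algorithm and it does not produce a valid CID: it hashes fixed-size chunks with SHA-256 and then hashes the list of chunk digests into a single root. The names `fixedSizeChunks` and `rootHash` are made up for this example, the 256-byte chunk size is only there to keep the demo small, and `node:crypto` is assumed for convenience (in a browser you would read the file piece by piece, for example with `Blob.slice()`, and hash with Web Crypto instead).

```typescript
import { createHash } from "node:crypto";

// Re-slice an arbitrary sequence of byte pieces into fixed-size chunks.
// Only the last chunk may be shorter than chunkSize.
function* fixedSizeChunks(
  pieces: Iterable<Uint8Array>,
  chunkSize: number,
): Generator<Uint8Array> {
  let buffer = new Uint8Array(chunkSize);
  let filled = 0;
  for (const piece of pieces) {
    let offset = 0;
    while (offset < piece.length) {
      // Copy as much of this piece as still fits into the current chunk.
      const n = Math.min(chunkSize - filled, piece.length - offset);
      buffer.set(piece.subarray(offset, offset + n), filled);
      filled += n;
      offset += n;
      if (filled === chunkSize) {
        yield buffer;
        buffer = new Uint8Array(chunkSize);
        filled = 0;
      }
    }
  }
  if (filled > 0) yield buffer.subarray(0, filled);
}

// Hash each chunk, then hash the concatenated chunk digests into one root.
// Simplified stand-in for a real IPFS DAG; the result is NOT a valid CID.
function rootHash(pieces: Iterable<Uint8Array>, chunkSize: number): string {
  const root = createHash("sha256");
  for (const chunk of fixedSizeChunks(pieces, chunkSize)) {
    root.update(createHash("sha256").update(chunk).digest());
  }
  return root.digest("hex");
}

// The same 1000 bytes, delivered once as a whole and once in uneven pieces.
const data = new Uint8Array(1000);
for (let i = 0; i < data.length; i++) data[i] = i % 251;
const whole = [data];
const pieces = [data.subarray(0, 7), data.subarray(7, 300), data.subarray(300)];

console.log(rootHash(whole, 256) === rootHash(pieces, 256)); // true
console.log(rootHash(whole, 256) === rootHash(whole, 128)); // false
```

The root depends only on the bytes and the chunking parameters, not on how the bytes arrive. The same holds for real CIDs: as long as the chunker, DAG layout, hash function and CID version are the same, streaming the file gives the same CID as adding it in one go, while changing any of those settings gives a different CID for the same file.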