Hey Graph’ers!
Are you ready for a special AMA with members of the core dev teams working on The Graph’s new roadmap - the New Era of The Graph? This isn’t just a new chapter for The Graph—it marks a transformative evolution in the world of web3, aiming to empower developers, boost the ecosystem, and redefine what’s possible with decentralized data.
📅 Join this special AMA from Tuesday, November 28, to Thursday, November 30, 2023.
Because core dev team members span multiple time zones, responses to your questions will be staggered, offering a continuous and evolving dialogue over three days. Feel free to ask questions at your convenience and return often to see new answers and participate in the ongoing discussions. Community members and moderators from The Graph’s Reddit channel will be on hand to guide the AMA and ensure your questions are addressed.
Meet the AMA participants, all from the core dev teams at The Graph:
- Adam Fuller - Product Manager, Edge & Node
- Alex Bourget - Co-founder & CTO, StreamingFast
- Chris Wessels - Founder, GraphOps
- Daniel Keyes - Founder & CEO, Pinax
- Eva Beylin - Director, The Graph Foundation
- Sam Green - Co-founder & Head of Research, Semiotic Labs
- Uri Goldshtein - Founder, The Guild
- Vincent Wen - Engineering Manager, Messari
- Yaniv Tal - Founder & CEO, Geo
✨ The New Era promises a suite of new data services and features that are set to drive the next generation of decentralized applications. From new tooling and features to updates and upgrades, The Graph is empowering developers and ecosystem contributors. You can read the official announcement here and on Twitter.
The roadmap is structured around five core objectives:
- World of Data Services: Expanding beyond subgraphs to deliver a rich market of data services on the network (e.g., new query languages, LLMs, etc.)
- Developer Empowerment: Supporting developers through enhanced DevEx and tooling (e.g., Sunrise of Decentralized Data, upgrade Indexer, etc.)
- Protocol Evolution and Resiliency: Delivering a more resilient, flexible, and efficient protocol
- Optimized Indexer Performance: Boosting Indexer performance with improved tooling and operational capabilities
- Interconnected Graph of Data: Creating tools for composable data and an organized knowledge graph
At the center of The Graph protocol is the power of community - so let’s hear your thoughts, feedback, and of course, answer any questions you may have about this New Era. Whether you’re curious about specific features, the roadmap’s objectives, or how you can get involved, the core devs are here to chat.
🌐 So, let’s dive in - ask the core devs anything about The New Era of The Graph!
Please note that this AMA will adhere to this channel’s Moderation & Administration Policy:
https://www.reddit.com/r/thegraph/comments/l0t81p/welcome_to_the_official_subreddit_for_the_graph/
Thanks for hosting this, and providing a glimpse of the future at datapalooza.
The separation of Extraction, Transformation, Loading, and Querying data seems to be key to accelerating the availability and flexibility of the data provided by The Graph. Sam’s announcement of bringing ClickHouse SQL to The Graph really excites me, as I’m currently wasting a lot of time writing/maintaining code to make aggregations over data as part of the transform step, instead of aggregating at query time.
What can we expect to see for the rollout of ClickHouse SQL on The Graph?
Since this depends on Substreams, which in turn depends on Firehose, what steps are needed to get Substreams working on OP Stack chains?
Will there be a way to get an “event substream” without call handlers, shipped earlier than the full Firehose implementation for OP Stack chains, since this can be done with just an RPC instead of instrumenting op-geth or op-reth?
Thanks.
Hi u/Drewsapple! This is Sam from Semiotic Labs. Regarding your rollout question, here’s the current status:
* We currently have Substreams to ClickHouse working well
* We have recently prototyped the SQL API
* We have a sketch for how to handle DBT experiments by the developer
* The plan is to get SQL queries on the network by Q1 2024
* We are very interested in learning more about our developers’ specific use cases for SQL. Please dm me if you would be interested in chatting!
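To make the query-time aggregation idea concrete, here is a minimal, hypothetical sketch. The transfers table, its columns, and the sample values are invented for illustration, and Python’s built-in sqlite3 stands in for ClickHouse (whose SQL dialect and engine differ substantially); the point is simply that totals which used to be materialized in the transform step can instead be computed by the query itself.

```python
# Illustrative only: a query-time aggregation, the kind a SQL data service
# could serve. Table and column names (transfers, block_time, sender, amount)
# are hypothetical; sqlite3 stands in for ClickHouse here.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transfers (block_time TEXT, sender TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transfers VALUES (?, ?, ?)",
    [
        ("2023-11-28", "0xabc", 100),
        ("2023-11-28", "0xdef", 250),
        ("2023-11-29", "0xabc", 50),
    ],
)

# Aggregate at query time instead of precomputing totals in the transform step.
rows = conn.execute(
    "SELECT block_time, SUM(amount) FROM transfers "
    "GROUP BY block_time ORDER BY block_time"
).fetchall()
print(rows)  # [('2023-11-28', 350), ('2023-11-29', 50)]
```

The same GROUP BY would run against ClickHouse once the SQL API is live, with the transform step left to emit raw rows only.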
Pinax will answer your OP stack question :)
Hey, I’m Daniel Keyes, CEO of Pinax, and we’re very pleased to be here participating in this AMA.
Thanks for asking these great questions. For SQL data services, Pinax is currently investigating how to deploy these services in a performant, modular, and reliable way. We’ll work closely with StreamingFast and Semiotic to improve the workflow as an operator of these services.
For Firehose, Pinax is working on adding RPC nodes for many EVM chains (if you want to see which ones, check the hosted service list of supported blockchains here: https://thegraph.com/docs/en/developing/supported-networks/). The StreamingFast team is working on a Firehose “light” stream that will not need to have deep instrumentation.
There will be some discussion on this topic in the Monthly Core Dev call this coming Thursday if you want to learn more. This page has info on how to access recordings of previous Core Dev calls and how to join in the future.
Hey apple :)
I’d love to learn more about your use case, and where you’re building aggregations in the transform step. Is that within Substreams? If so, we’re working hard to make this simpler and simpler (see the substreams CLI changelog, in particular around the init command).
That being said, we very much understand there’s a whole lot that is best done at query time. That’s why we’re putting a lot of effort into the SQL sink (https://github.com/streamingfast/substreams-sink-sql). It already has a high-throughput injector, reorg navigation (which we just released) for Postgres, support for ClickHouse, and a bunch of other features.
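At its core, reorg navigation in a sink means every row remembers which block wrote it, so a fork can be undone by deleting rows above the last canonical block. The sketch below is not the real substreams-sink-sql implementation; the schema, function names, and sqlite3 backend are all illustrative.

```python
# Minimal sketch of reorg handling in a SQL sink: rows carry the block number
# that produced them, so when the chain forks, everything above the last valid
# block is deleted and re-injected from the canonical chain. Illustrative only.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entries (block_num INTEGER, key TEXT, value TEXT)")

def insert_entry(block_num, key, value):
    db.execute("INSERT INTO entries VALUES (?, ?, ?)", (block_num, key, value))

def handle_reorg(last_valid_block):
    # Undo everything written by blocks that were forked out.
    db.execute("DELETE FROM entries WHERE block_num > ?", (last_valid_block,))

insert_entry(100, "supply", "1000")
insert_entry(101, "supply", "1010")
insert_entry(102, "supply", "1025")   # block 102 later forks out

handle_reorg(101)                     # chain reorged back to block 101
remaining = [r[0] for r in db.execute("SELECT block_num FROM entries")]
print(remaining)  # [100, 101]
```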
This SQL sink is also what we’re turning into a deployable unit, shippable to The Graph network eventually. You have our first take at it here: https://substreams.streamingfast.io/tutorials/substreams-sql … but I think it’ll evolve quite a bit. The goal is that indexers can run those deployment endpoints, and even some gateways can accept deployment requests and decide where to optimally run workloads.
Our goal is to make it as easy as possible for you to think of a data service, pluck some from the community, and have them running on your behalf somewhere on The Graph network.
We’ve just recently closed this issue: https://github.com/streamingfast/substreams/issues/278 and we’ve rolled out an RPC poller for Firehose Ethereum that requires only an RPC node. The data is lighter, but we can cover many more chains much faster.
Using this method, we’ve backfilled the Arbitrum network (the pre-Nitro “Classic” era), and we’ll be syncing one chain after another this way. We’re currently syncing Bitcoin Core (!) using this new method. OP is next on our list, but with a few instructions one could start using it right away. We’ve crafted a more precise definition of a Firehose extractor (you can read about it here: https://github.com/streamingfast/firehose-core/issues/17) and have implemented the RPC poller using this interface. Our goal is to speed up chain coverage by simplifying extraction and not always requiring core chain instrumentation. And if people want better performance than some RPC nodes allow (which induce a bit of latency), deeper instrumentation can still be added after the fact.
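The polling approach described above can be sketched in a few lines: walk the chain block by block over plain RPC, checking that each block links to its parent so a fork is noticed. This is a toy model, not the firehose-core implementation; fetch_block stands in for an eth_getBlockByNumber-style call, and the in-memory chain is fake.

```python
# Sketch of a "light" extractor that polls an RPC endpoint block by block
# instead of instrumenting the node. fetch_block is a stand-in for an
# eth_getBlockByNumber-style call; the real poller lives in firehose-core.
def poll_blocks(fetch_block, start, head):
    """Yield blocks in order, verifying each links to its parent."""
    parent_hash = None
    for number in range(start, head + 1):
        block = fetch_block(number)
        if parent_hash is not None and block["parent_hash"] != parent_hash:
            raise RuntimeError(f"fork detected at block {number}")
        parent_hash = block["hash"]
        yield block

# Fake in-memory chain standing in for an RPC node, for illustration.
chain = {
    0: {"number": 0, "hash": "h0", "parent_hash": None},
    1: {"number": 1, "hash": "h1", "parent_hash": "h0"},
    2: {"number": 2, "hash": "h2", "parent_hash": "h1"},
}
blocks = list(poll_blocks(chain.__getitem__, 0, 2))
print([b["number"] for b in blocks])  # [0, 1, 2]
```

Trading the deep instrumentation of an extractor built into the node for this kind of polling is what makes covering many chains quickly feasible, at the cost of lighter data and a bit of RPC latency.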
I think this addresses your last question too ^^.
Thanks for reaching out!
- Alexandre Bourget, CTO at StreamingFast.io