InfluxData is arguably playing catch-up with Thanos, Cortex, and other scale-out Prometheus backends for the metrics use case. Given that, I wonder why they decided to write a new storage backend from scratch instead of building on the work Thanos and Cortex have done. Those two competing projects successfully share a lot of code that allows all data to be stored in object storage like S3.
Those two systems are designed to work with Prometheus-style metrics, which are very specific: you have metric names, labels, float values, and millisecond epochs.
I'm not totally sure how they index things, but I would guess it's by time series, with an inverted-index-style mapping from the metric and label data to the underlying time series. That means they'll have the same problems working with high-cardinality data that I outlined in the blog post.
InfluxDB aims to reach a broader audience than just metrics. We think the table model is great, particularly for event time series, where we want to be best in class. A columnar database is better suited to analytics queries, and given the right structure it is every bit as good for metrics queries.
I agree, they appear to be playing catch-up on many fronts. Notably, with Cortex, Tempo, and Loki, Grafana Labs seem to have pulled way ahead in advancing a successful open-source cloud observability strategy.
InfluxData have a long history of writing (and rewriting) their own storage engines, so choosing to do it again is unsurprising. I guess this sort of hints that the current TSM/TSI have probably reached their performance and scalability limits and will be EOL before too long.
What I find interesting is that this project is already almost a year old and only has six contributors (two of whom look like external contractors). It seems more like a fun side project than the future core of the database that is supposed to be deployed into production next year.
I think the best new projects are created by a small focused team. Adding too many people too early actually slows things down. But, of course, I'm biased.
The thing about getting this to production next year is that we're doing it in our cloud, a service-based system where we can bring all sorts of operational tooling to bear: out-of-band backups, use of cloud-native services, shadow serving, red/green deploys, and more. Basically, it's easier to deploy a service to production once you've built a suite of operational tools that let you do it reliably while testing under production workloads that don't actually face the customer.
As for us rewriting the core of the database, that's true. But I think you're being unrealistic about what data systems look like inside closed-source SaaS providers as they advance through orders of magnitude of scale. Hint: they rewrite and redo their systems.
As for Grafana, Metrictank was their first, Cortex wasn't developed there, and Loki and Tempo look like interesting projects.
None of those things has the exact same goal as InfluxDB. And InfluxDB isn't meant to be open source DataDog. That's not our thing. We want to be a platform for building time series applications across many use cases, some of which might be modern observability. It also doesn't preclude you from pairing InfluxDB with those other tools.
My larger point is that, in contrast with the new project, InfluxDB currently has ~400 contributors. I'm certain that many dozens of those were involved in getting the current storage engine to a stable place. And now that hard work is on a path to deprecation as the project moves to a completely new language and set of underlying technologies.
Taking the project from a handful of contributors to a production-ready technology within an existing ecosystem is a non-trivial task. I'm sure it will come together eventually, but the commitment to ship it "early next year" seems optimistic to me.
We'll be producing builds early next year. Those won't be anything we recommend for production. Our goal is to have an early alpha in our own cloud environment by the end of Q1. I stress alpha. But we'll also have a bunch of tooling around it (which we've already built for the other parts of our cloud product) to back up data out of band, monitor it, shadow-test it against production workloads, etc.
We're also building on top of the work of many others who have built Arrow and libraries within the Rust ecosystem.
As for when the open source version GAs, I don't really know. But we're doing this out in the open so people can see, comment, and maybe even contribute. Who knows, maybe after a few years you'll be a convert ;)
https://grafana.com/blog/2020/07/29/how-blocks-storage-in-co...