What I really want: a SQLite storage backend driver for S3/GCS. No need for disks then. I haven't been able to find such a solution though, and I'm not technically proficient enough in C (the language SQLite is written in) to build it myself.
The challenge with this is that S3 and friends are object stores, meaning you upload or download the whole file each time. As you can imagine, this will cost you tremendous bandwidth even to insert one row.
Furthermore, it doesn't solve the multiple writers problem, because (afaik) there's no way to lock a file on S3.
It costs roughly $5/million PUTs and $0.40/million GETs on S3 in addition to the bandwidth and storage you use.
S3 objects are also immutable. Once they’re written, they can’t be updated.
A read-only version of this might be useful, but probably wouldn’t work in-place.
Something that might be of interest is S3 Select support, which lets you query a single (optionally compressed) CSV, JSON, or Parquet file server-side at the same cost as a regular S3 GET.
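To make that concrete, here's a minimal sketch of what an S3 Select call looks like with boto3's `select_object_content`. The bucket name, key, and query are hypothetical placeholders; actually running the query requires AWS credentials, so the request-building and the network call are split apart:

```python
def build_select_params(bucket, key, expression):
    """Build the request dict for boto3's select_object_content.

    Assumes a gzipped CSV with a header row; adjust InputSerialization
    for JSON or Parquet objects.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {
            "CSV": {"FileHeaderInfo": "USE"},
            "CompressionType": "GZIP",
        },
        "OutputSerialization": {"JSON": {}},
    }

def run_select(params):
    """Stream matching records back from S3 (needs AWS credentials)."""
    import boto3  # imported here so the builder above works offline
    s3 = boto3.client("s3")
    resp = s3.select_object_content(**params)
    for event in resp["Payload"]:
        if "Records" in event:
            yield event["Records"]["Payload"].decode()

# Hypothetical usage:
params = build_select_params(
    "my-bucket", "data.csv.gz", "SELECT s.name FROM S3Object s WHERE s.age > '30'"
)
```

Note that the SQL dialect is limited: you query one object at a time, referenced as `S3Object`, with no JOINs across objects (that's where Athena comes in).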
And if you really want relational semantics (i.e. JOINs, aggregations, and sub-queries) on a bucket full of CSV, JSON, Parquet, ORC, or regular-expression-describable files, definitely look at Athena: it's cost-effective, performs well even on buckets holding hundreds of TBs of data, and costs only $5/TB of data scanned per query.
RediSQL looks pretty sweet! To be frank, and this is only my personal opinion, I don't think I would want to pay for an API; I'd rather run my own. The business model of providing everything OSS but sending telemetry seems rather intriguing; as a hobbyist user I am OK with such telemetry being collected. If I were to run it in production for a business app, though, I wouldn't even consider the unpaid version, for the following reasons:
1. I do not want my production instance to shut down for ANY reason. This is just not an acceptable risk for most businesses. The only time a DB can go down is when something goes wrong.
2. As an engineer, I understand that 3 slightly inaccurate counters aren't a big deal. I can even look at the source and verify that they really do what you say. Justifying this to a security org, however, will be a complete nightmare, as most security orgs in enterprises are staffed with barely technical folks masquerading as "security".
So, it seems like a pretty good way to coerce enterprises to pay up while letting hobbyists continue using it. Very smart, I wish you the very best!
Benchmarks are always tricky, but sometimes useful, so yes, I should post some of them.
Right now I am busy with releasing the v2, but after that I should definitely do some more marketing.
Anyhow, to give you an order of magnitude: with in-memory data storage we reach ~80k inserts per second, on a machine with 1 vCPU and 3 GB of RAM, a $15/month box from DO.
That would be cool! As an interim step, if your data is small enough, perhaps you could run an in-memory SQLite db and periodically back it up to a permanent S3 file?
APSW exposes the SQLite backup API, so you could do the backups online without shutting down the database.
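For what it's worth, Python's stdlib `sqlite3` also exposes the same online backup API (as `Connection.backup`, Python 3.7+), so you don't strictly need APSW. A minimal sketch of the in-memory db plus disk snapshot idea; the table, file name, and the S3 upload step are illustrative assumptions:

```python
import sqlite3

# The live store: an in-memory SQLite database.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
mem.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
mem.commit()

# Online backup to a disk file, without blocking writers on `mem`.
# This uses SQLite's backup API under the hood, the same one APSW wraps.
disk = sqlite3.connect("snapshot.db")
with disk:
    mem.backup(disk)

# snapshot.db could now be uploaded to S3 (e.g. with boto3's upload_file)
# on a timer, giving you cheap periodic durability.
row = disk.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone()
```

The obvious caveat is that you can lose whatever was written between snapshots, so this only suits data you can afford to replay or re-derive.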