I was going to say I expect much of the gain to come from "pragma synchronous=normal", which tells SQLite on your MBP M1 not to bother committing the data to SSD durably before telling the application that the insert is committed. From the SQLite docs:
> WAL mode is always consistent with synchronous=NORMAL, but WAL mode does lose durability. A transaction committed in WAL mode with synchronous=NORMAL might roll back following a power loss or system crash.
But then I realised that "pragma fullfsync=off" is the default; fullfsync is a macOS-specific pragma.
So, on macOS only, WAL with synchronous=FULL (the default for that pragma) has the same durability issue as synchronous=NORMAL: committed transactions might be rolled back following a power loss or system crash, albeit with different probabilities.
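For the Python crowd, here's a minimal sketch of the pragmas being discussed, using the stdlib sqlite3 module (the file name and schema are just for illustration):

```python
import sqlite3

# Illustrative file name and schema; adjust to taste.
conn = sqlite3.connect("bench.db")

# WAL + synchronous=NORMAL is the common fast-but-less-durable combo.
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")

# macOS only: fullfsync defaults to OFF, so even synchronous=FULL
# doesn't force a true flush to stable storage there. Opt in with:
# conn.execute("PRAGMA fullfsync=ON")

conn.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, v TEXT)")
conn.commit()
```

Note WAL needs an on-disk database; journal_mode stays "memory" for ":memory:" connections.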
If you really want to maximize SQLite insert throughput, prepare a statement with sqlite3_prepare_v3(), then in your inner loop call sqlite3_bind_* + sqlite3_step(). 100k inserts/sec is easily achievable. Unfortunately this interface is not exposed in most language bindings.
A fair number of ORMs will do this for you fwiw, but then you'll spend more CPU in the ORM stack so it is a bit of a tradeoff for micro-benchmarks like this.
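The raw C API isn't reachable from Python's stdlib sqlite3 module, but executemany() gets you most of the way: it prepares the statement once, then binds and steps it per row (the module also keeps an internal statement cache). A rough sketch with a throwaway in-memory table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

# Prepared once, then bind + step per row, all inside one transaction.
rows = ((i, f"value-{i}") for i in range(100_000))
with conn:  # commits on success, rolls back on exception
    conn.executemany("INSERT INTO t (id, v) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```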
If you think about this in latency terms, you are able to insert a row and be done with the entire ceremony in about 12 microseconds. This is serialized throughput too.
I think it is unlikely you would get this kind of performance with a hosted solution across the network. You could cheat with batching & caching, but for a cold, one-record non-query, nothing comes close to the latency of SQLite on a local NVMe device.
I tried the same test on a Hetzner VPS (4 vCPU AMD EPYC, 8 GB RAM, 160 GB NVMe SSD), and the result is about 20K inserts per second. It's not 80K, but still exciting; I would even be happy with 10K. I'm sure there's plenty of room for optimization, but I'm planning to use SQLite3 with Python in a very simple way.
If you (more commonly) run on a different host, you’re looking at more like 20x-50x the latency, and this is assuming you’re in the same region.
To even get close to the SQLite level of throughput you’d have to stagger/pipeline your requests over a number of concurrent connections (proportional to the earlier x).
You’ll eventually succeed at doing just that, with multiple machines. Congratulations, but now you have to also consider the N+1 problem, even for small Ns. In SQLite, you can afford much more back-and-forth with simple queries.
* He’s comparing an in-memory SQLite DB against a non-temp table in Postgres. Of course the RTT is lower when you don’t have to interact with the filesystem or flush to disk in the critical path.
* He’s connecting to Postgres over a loopback interface, instead of using a UNIX socket. Of course the RTT is higher when requests and responses needlessly traverse the TCP/IP stack.
> He’s comparing an in-memory SQLite DB against a non-temp table in Postgres.
That’d surprise me; where does he say that? (Even if he doesn’t, it’s important to iron out, so thanks for being critical!) The ~18us sounds reasonable from my own experiments, and that’s with flushing to file (using WAL mode).
EDIT: I think I ran my benchmarks with synchronous=normal. Not sure what he did.
> He’s connecting to Postgres over a loopback interface, instead of using a UNIX socket.
I agree unix sockets would be better for same host setup, but still, the more representative number for “typical use” would be networked, no?
Also, for fairness: SQLite setups come with downtime, which is hard to avoid. Worth mentioning.
> That’d surprise me; where does he say that? (Even if he doesn’t, it’s important to iron out, so thanks for being critical!) The ~18us sounds reasonable from my own experiments, and that’s with flushing to file (using WAL mode).
I was looking at the benchmark results at 13:27, but now I see there's another set of benchmark results shown at 29:13. He doesn't link to a repo for that second one, but in-memory vs persistent doesn't matter as much when measuring reads if the results are cached in memory.
> I agree unix sockets would be better for same host setup, but still, the more representative number for "typical use" would be networked, no?
If we're going by what's "representative", then a benchmark isn't useful, because Postgres and SQLite have dissimilar use cases. If you want to compare their performance, it only makes sense to eliminate confounding variables, of which the TCP/IP stack is significant.
> He doesn't link to a repo for that second one, but in-memory vs persistent doesn't matter as much when measuring reads if the results are cached in memory.
No doubt that we deserve reproducible benchmarks and this could use more details. As for caching that’d be the same for Postgres in theory.
> If we're going by what's "representative", then a benchmark isn't useful, because Postgres and SQLite have dissimilar use cases.
I agree and disagree! We already know Postgres is performant enough for a majority of server-side use cases. Ben is trying to show what you can expect from SQLite if you use it for those cases, which is novel to most people. SQLite doesn’t have the same amount of features, and partly because of that drawback, the extremely low latency (in a typical setup) is a redeeming factor. I think it’s not misleading to say SQLite has significantly lower app->db _latency_ than networked databases. A Unix socket is a fairer microbenchmark, yeah, but at least to me that’s not as useful because the Postgres setup wouldn’t be same-host.
The relative performance difference of no-op queries (RTT) is unlikely to have a high impact on application performance since they're so quick in absolute terms.
> Many programs use fopen(), fread(), and fwrite() to create and manage files of data in home-grown formats. SQLite works particularly well as a replacement for these ad hoc data files. Contrary to intuition, SQLite can be faster than the filesystem for reading and writing content to disk.
For a project a couple of years ago, I used Go, and what I did was very simple: I took advantage of its channel mechanism and created a slice to buffer (memoize, more or less) 100K generated rows, which I then inserted inside a BEGIN TRANSACTION / END; in other words, every 100K rows I would insert them in one transaction and repeat.
If my memory serves me well, I could insert 10M rows in approximately 6.8 seconds (give or take), and bear in mind I didn't use WAL at all.
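In Python terms, the same pattern looks roughly like this (a loose translation: the Go original buffered rows through a channel; here a generator plus itertools.islice stands in, and the table name and row shape are made up):

```python
import itertools
import sqlite3

BATCH = 100_000
TOTAL = 300_000  # scaled down from 10M for illustration

def gen_rows(n):
    for i in range(n):
        yield (i, f"row-{i}")

conn = sqlite3.connect(":memory:")  # use a file path for a real benchmark
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

it = gen_rows(TOTAL)
while True:
    chunk = list(itertools.islice(it, BATCH))
    if not chunk:
        break
    with conn:  # one BEGIN ... COMMIT per 100K-row batch
        conn.executemany("INSERT INTO t VALUES (?, ?)", chunk)
```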
Were those individual inserts, or batched in a transaction? The latter should significantly bump those numbers up. Also consider using a prepared statement while you're at it.