HashBackup author here, been using SQLite for about 15 years.
Doing a vacuum after a large number of deletes might make sense. The only real purpose of vacuum IMO is to recover free space from a database. Vacuum may also optimize certain access patterns for a short while, though I have never tested this, and it would be highly dependent on the queries used. If fragmentation is a bigger concern for you than recovering free space, you can also compute the fragmentation and use that to decide whether to vacuum, via the dbstat virtual table:
https://www.sqlite.org/dbstat.html
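As a rough sketch of what that could look like (my own code, not from HashBackup): walk dbstat in btree order and count pages that don't immediately follow their logical predecessor. This assumes your SQLite was built with SQLITE_ENABLE_DBSTAT_VTAB; the function returns None if dbstat isn't available.

```python
import sqlite3

def fragmentation(db_path):
    """Estimate fragmentation as the fraction of pages that do not
    immediately follow their logical predecessor within each btree.
    Requires SQLite compiled with SQLITE_ENABLE_DBSTAT_VTAB."""
    con = sqlite3.connect(db_path)
    try:
        # Ordering by path gives each btree's pages in traversal order;
        # in a freshly vacuumed db, pageno is mostly sequential in that order.
        rows = con.execute(
            "SELECT name, pageno FROM dbstat ORDER BY name, path"
        ).fetchall()
    except sqlite3.OperationalError:
        return None  # dbstat virtual table not compiled in
    finally:
        con.close()
    gaps = total = 0
    prev = {}
    for name, pageno in rows:
        if name in prev and pageno != prev[name] + 1:
            gaps += 1
        prev[name] = pageno
        total += 1
    return gaps / total if total else 0.0
```

You'd then vacuum when the returned fraction crosses whatever threshold you care about.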
Then again, computing this likely requires reading most of the database pages, so it might take nearly as long as a vacuum itself. The other gotcha is that just because db pages appear sequential in the file doesn't mean they are sequential on the physical drive, though filesystems do strive for that.
SQLite has pragma commands (PRAGMA page_count and PRAGMA freelist_count) to tell you the total number of db pages and how many are free. When the percentage of free pages is greater than x% and it's a convenient time, do a vacuum. For a highly volatile db, you can add a table containing this percentage, update it every day, and make your decision based on an average, but IMO it's easier just to check for more than 50% free (or whatever) and do the vacuum.
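The simple threshold check is only a few lines; here's a minimal sketch (the function name and 50% default are my own choices):

```python
import sqlite3

def vacuum_if_needed(db_path, threshold=0.5):
    """Vacuum the database when the fraction of free pages exceeds
    threshold. Returns True if a vacuum was performed."""
    con = sqlite3.connect(db_path)
    try:
        total = con.execute("PRAGMA page_count").fetchone()[0]
        free = con.execute("PRAGMA freelist_count").fetchone()[0]
        if total and free / total > threshold:
            # VACUUM must run outside any open transaction.
            con.execute("VACUUM")
            return True
        return False
    finally:
        con.close()
```

Run it from a nightly cron job or at application startup, whichever counts as "a convenient time" for your workload.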
Vacuums used to be (circa 2019) pretty slow operations, but the SQLite team has sped them up greatly since then. Vacuuming a 3GB SQLite db on an SSD takes less than a minute these days. That's with the db 100% full; with only 50% used pages, it would be considerably faster.
Vacuums are done in a statement transaction, so you don't have to worry about a "half vacuum that runs out of disk space" screwing up your database.