Should someone literally pull the plug on your Postgres database server, your data is 100% safe - except for unlogged tables, where your data is 100% gone, by design!
This omits a crucial fact: your data is not gone until PostgreSQL has gone through its startup recovery phase. If you really need to recover unlogged table data (to which every proper database administrator will rightly say "then WhyTH did you make it unlogged"), you should capture the table files before starting the database server again. And then pay through the nose for a data recovery specialist.
> However, a crash will truncate it.
So this isn't exactly true. A crash recovery will truncate it.
Unlogged table data commonly won't even have been written to disk. And what is on disk is completely inconsistent after a crash. So really, the data is gone once Postgres has crashed (on its own, due to hardware issues, kill -9, ...).
They do have backing files - after all, unlogged tables can end up many times the size of memory, so we need to reserve space for them, and there's a very high likelihood we'll need to evict their buffers or shut down at some point. But they're only written out due to buffer pressure or shutdown checkpoints (plus things like moving the tablespace).
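For the curious, you can see where those backing files live from SQL (table name made up; pg_relation_filepath is a stock function):

    CREATE UNLOGGED TABLE scratch (id int, payload text);
    INSERT INTO scratch SELECT g, md5(g::text) FROM generate_series(1, 1000) g;

    -- path of the main fork, relative to the data directory; the empty
    -- "init" fork used to reset the table during crash recovery sits
    -- next to it with an _init suffix (e.g. base/16384/24576_init)
    SELECT pg_relation_filepath('scratch');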
A quick scan of the PostgreSQL manual turns up nothing. I would have expected the steps performed during startup recovery to be documented there. Nor is the manual explicit about data file usage for an unlogged table; there are just these two snippets:
> The property of certain relations that the changes to them are not reflected in the WAL. This disables replication and crash recovery for these relations.
Both say nothing about the normal data files underlying unlogged tables, so none of what I wrote can be found in the official docs (or maybe it can, just not by me ;)
> it would be nice to have in-memory tables if they would perform faster. Somebody could object that PostgreSQL is on-disk database which shouldn't utilize in-memory storage. But users would be interested in such storage engine if it would give serious performance advantages.
At a high level, there are a few components that coordinate what happens during a transaction in terms of WAL logging:
1. The Buffer Pool (memory for pages in the database)
2. The Log Manager
3. The Transaction Manager
4. The Storage Manager (read/write to disk)
5. Accessor Methods (interpret page bytes as e.g. Heap or BTree)
This is an abstract picture, not particular to Postgres, which doesn't use exactly these names for all of the above.
Normally, when the Transaction Manager creates transactions that modify records/tuples (using the Accessor Methods), these actions need to be persisted via the Log Manager to the WAL.

The pages of memory backing these records come from the Buffer Pool, and the Buffer Pool must also log certain actions of its own.

Before the Buffer Pool can flush any modified page to disk, the WAL records covering that page's changes must already have been persisted via the Storage Manager (that's the "write-ahead" part of write-ahead logging).

When you create unlogged tables, none of this happens: when you modify records, there's no trail.
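To make that concrete, a rough psql sketch (Postgres 10+; table names made up) comparing how much WAL the same insert generates against a logged vs. an unlogged table:

    CREATE TABLE logged_t (x int);
    CREATE UNLOGGED TABLE unlogged_t (x int);

    -- capture the current WAL position, insert, then measure the delta
    SELECT pg_current_wal_lsn() AS before \gset
    INSERT INTO logged_t SELECT generate_series(1, 100000);
    SELECT pg_current_wal_lsn() - :'before'::pg_lsn AS wal_bytes_logged;

    SELECT pg_current_wal_lsn() AS before \gset
    INSERT INTO unlogged_t SELECT generate_series(1, 100000);
    SELECT pg_current_wal_lsn() - :'before'::pg_lsn AS wal_bytes_unlogged;

The second delta should be close to zero: the unlogged table's data never transits the log.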
> Pointing out that it is merely queued for deletion...
ok, so before deletion occurs, 1) how are you going to get it back and 2) how do you know how much you have/haven't lost (that transactional integrity I mentioned)?
I understand what you're saying and I think we're both right, except that we're looking from different angles: you're thinking of it as bits on a disc and I'm looking at it as if I were running my business on that database.
Edit: put another way, you: "Yep, I think we can get something back". Me: "aaaaaaaaaaaargh! mah payroll!"
Very well written and detailed article, with the caveat that they never mention a use case they consider legitimate. Does anyone here have any uses? I could imagine some sort of ETL type tasks which are transient could make sense. Thoughts?
I use them for importing from files to tables. Run some cleanup and validation before moving the data to its final destination (roughly the flow sketched below). If pg crashes halfway, I still have the file and can restart the import from zero.
A second use is importing httpd combined logs and doing some analysis to find out which chains of http calls caused certain kinds of behaviour. When done, the tables get deleted. It allows easy ad hoc queries, correlating with monitoring tables, indexing... I used to write some python scripts, but pg did better than expected, and this way of working stuck somehow.
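A minimal sketch of that first workflow, with made-up table names and paths:

    CREATE UNLOGGED TABLE staging_orders (LIKE orders INCLUDING DEFAULTS);
    COPY staging_orders FROM '/data/orders.csv' WITH (FORMAT csv, HEADER);

    -- cleanup/validation happens here, then the data moves to the
    -- final (logged) destination; the staging table is disposable
    INSERT INTO orders SELECT * FROM staging_orders WHERE amount IS NOT NULL;
    DROP TABLE staging_orders;

If Postgres crashes mid-import, the staging table is truncated on recovery, but the source file is still there, so you just rerun.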
I've been using these for some larger analytical queries that run in batch fashion (typically something that's done once or a couple of times per day).
I find it hard to get good performance from large queries with CTEs. Often it's much easier (and uglier) to just split them into multiple steps, create intermediate tables for each step, and add the necessary indexes (sketched below).
Temporary tables are then of course even better, since usually you don't want to have these left around. But temporary tables are also unlogged.
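Something like this rough shape (all names made up):

    -- step 1: materialize an expensive aggregation, with real stats
    CREATE UNLOGGED TABLE step1 AS
      SELECT customer_id, sum(amount) AS total
      FROM sales
      GROUP BY customer_id;
    CREATE INDEX ON step1 (customer_id);
    ANALYZE step1;

    -- step 2: join against it; the planner now knows exact row counts
    CREATE UNLOGGED TABLE step2 AS
      SELECT s.customer_id, s.total, c.region
      FROM step1 s JOIN customers c USING (customer_id);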
Which versions of Postgres have you been running your large CTE based queries with?
Up until Postgres 12, all CTEs were materialised before their results were used, effectively making every CTE a temporary table. In Postgres 12 onwards, for side-effect-free CTEs that are only referenced once, Postgres will take constraints from the parent query and push them into the CTE, reducing the amount of data they need to process and allowing better use of indexes.
OOI have you tried converting your CTEs into sub-queries as an experiment to see if they’re faster? Even in Postgres 12 onwards, sub-queries and CTEs aren’t always treated the same by the query planner, and you can get some surprising differences in performance.
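For reference, Postgres 12+ also lets you control this per CTE (tables and columns made up):

    -- force the pre-12 behaviour: the CTE is an optimization fence
    WITH recent AS MATERIALIZED (
      SELECT * FROM events WHERE created_at > now() - interval '1 day'
    )
    SELECT * FROM recent WHERE user_id = 42;

    -- let the planner push user_id = 42 down into the CTE
    WITH recent AS NOT MATERIALIZED (
      SELECT * FROM events WHERE created_at > now() - interval '1 day'
    )
    SELECT * FROM recent WHERE user_id = 42;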
I can't speak to Parent's situation, but temp tables are common practice in OLAP workloads across many database engines (personal experience with Postgres, Presto, Spark, Netezza).
At a certain level of query complexity, no query planner is able to accurately predict the characteristics of sub-queries or CTEs, resulting in query plans that are ill-suited to the problem. More often than not, materializing the CTEs inside a large query into 2-3 temporary tables results in order-of-magnitude performance gains, as the database now knows exactly how many rows it's dealing with, has stats on # of nulls, etc.
Certainly don’t disagree, I’m well acquainted with the limits of the query planner. Especially when dealing with high complexity queries.
My broader point is that CTEs in Postgres are more nuanced than they appear at first blush. They’re frequently presented as simply a method of writing cleaner, more readable SQL. The fact that CTEs receive special treatment by the planner is often missed, along with the fact that two functionally identical queries, one using sub-queries, the other CTEs, can have wildly different performance characteristics.
Does COPY FROM sidestep the WAL? My (perhaps incorrect) understanding is that pg writes such data to a table file then uses a small transaction to make it live.
It does not, except if you're COPYing into an unlogged table - for regular tables, data that gets COPYed in is WAL-logged and crash safe.
In my understanding, as compared to a multi-row insert, a COPY statement is (1) a lot more efficient to parse/process, and (2) uses a special bulk write strategy that avoids churn and large amounts of dirty pages in shared buffers.
As compared to multiple single-row inserts, another benefit is that you're writing a single transaction ID instead of a lot of individual transaction IDs (meaning less freezing work for autovacuum, etc.).
We also batch the insertion into the table and the WAL logging, if possible. There are cases where that optimization has to be disabled, though (presence of some types of triggers, for example).
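For reference, the two shapes being compared (table and file path are made up):

    -- many single-row inserts: parsed one by one, one transaction ID each
    INSERT INTO measurements (ts, sensor_id, value) VALUES ('2024-01-01', 1, 0.5);
    INSERT INTO measurements (ts, sensor_id, value) VALUES ('2024-01-01', 2, 0.7);
    -- ...

    -- one COPY: a single parse, a single transaction ID, bulk writes
    COPY measurements (ts, sensor_id, value)
      FROM '/data/measurements.csv' WITH (FORMAT csv);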
I’ve used it for bulk loading on time series data partitioned by date.
It was a few years ago so the details are blurry. We got daily snapshots, and on ingest we created an unlogged temporary table, then when complete used an INSERT ... SELECT into the destination table, if I remember right.
As long as the application handles failure on bulk ingest, and you copy the data to a logged table or enable logging on the table once ingested, it worked really well.
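Something along these lines, assuming a destination table range-partitioned by day (all names made up):

    CREATE UNLOGGED TABLE readings_2024_01_15 (LIKE readings INCLUDING ALL);
    COPY readings_2024_01_15 FROM '/snapshots/2024-01-15.csv' WITH (FORMAT csv);

    -- either move the rows into the logged destination ...
    INSERT INTO readings SELECT * FROM readings_2024_01_15;

    -- ... or make the staging table crash-safe and attach it as a partition
    ALTER TABLE readings_2024_01_15 SET LOGGED;
    ALTER TABLE readings ATTACH PARTITION readings_2024_01_15
      FOR VALUES FROM ('2024-01-15') TO ('2024-01-16');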
Probably not, unless you don’t want to keep the data long term.
The initial load might be faster, but you’ll end up repaying that time saving when you turn logging back on, as Postgres is then forced to read the contents of the table and push it into the WAL anyway, throwing away any saving you originally made by avoiding the WAL with an unlogged table.
It can be useful to bulk load data into a temporary table, do validations there (which may involve joins to your permanent tables, so it's convenient to be able to do them in the database vs. the application), and then once everything is validated, copy into your real tables from the temporary table. If you have a process like that, it could make sense to have the temporary table be unlogged (assuming the copy to the real tables is still safe if you do that).
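The validation step might look something like this (made-up names):

    -- flag rows that don't match permanent reference data
    SELECT s.*
    FROM staging_payments s
    LEFT JOIN accounts a ON a.id = s.account_id
    WHERE a.id IS NULL;  -- unknown accounts: fix or reject before loading

    -- once clean, copy into the real (logged) table
    INSERT INTO payments SELECT * FROM staging_payments;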
I am not saying it does actually work this way, but there is no reason a bulk import needs to touch the WAL at all during the import. As long as a lock was held on the table the entire time during the import, the import procedure can simply write to the WAL that this table is consistent as it is on disk after flushing, flush the WAL, then release the lock on the table.
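(For what it's worth, Postgres does have something close to this: under wal_level = minimal, a COPY into a table created in the same transaction can skip WAL-logging the rows and just fsync the files at commit. A sketch:)

    -- requires wal_level = minimal in postgresql.conf, which also
    -- rules out WAL archiving and streaming replication
    BEGIN;
    CREATE TABLE bulk_import (id int, payload text);  -- new in this transaction
    COPY bulk_import FROM '/data/dump.csv' WITH (FORMAT csv);
    COMMIT;  -- the row data may never be written to the WAL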
No it can’t, because doing that would break live backups and replication, which both depend on capturing the WAL and assume it’s a total representation of all data managed by the DB. Any data that isn’t written to the WAL in its entirety can’t be replicated, and can’t be guaranteed to be backed up correctly.
If you’re willing to throw away that functionality, then maybe you could avoid writing the rows to the WAL completely (but it seems Postgres doesn’t support that in general). But personally I would always be a little suspect of data stored in Postgres that never transited the WAL. It’s a core part of how Postgres provides its durability guarantees, and lots of tools and functionality assume that everything worth protecting gets written to it at least once.
Can we have "best of both worlds" for inserting a lot of data quickly: an unlogged table for performance and a trigger to copy data to another regular table for permanence?
I'd think so, if there is some asynchronous processing involved that requires lots of intermediate bouncing to disk: do the input queue on logged tables and the WIP state on unlogged ones, and when they are done, push to the output queue (again logged) and remove from the input in the same transaction.
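A sketch of that shape (all names made up):

    -- durable input and output queues; scratch work stays unlogged
    CREATE TABLE input_queue  (id bigserial PRIMARY KEY, payload jsonb);
    CREATE TABLE output_queue (id bigint PRIMARY KEY, result jsonb);
    CREATE UNLOGGED TABLE wip (id bigint PRIMARY KEY, state jsonb);

    -- ... heavy intermediate processing reads and writes wip ...

    -- completion is atomic: publish the result and retire the input together
    BEGIN;
    INSERT INTO output_queue SELECT id, state FROM wip WHERE id = 42;
    DELETE FROM input_queue WHERE id = 42;
    DELETE FROM wip WHERE id = 42;
    COMMIT;

If the server crashes mid-processing, wip is truncated on recovery, but the input row is still queued, so the work item simply gets redone.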