zfs configuration
I recommend starting with the following configuration and tuning it as you learn more:
recordsize=128k - same as default
compression=lz4 - enables lz4 compression
atime=off - disables access time updates
xattr=sa - better extended attributes
logbias=latency - same as default
redundant_metadata=most - may improve random writes
https://bun.uptrace.dev/postgres/tuning-zfs-aws-ebs.html#zfs-config
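Applied with the zfs command line, the recommended properties above might look like the following sketch (the dataset name tank/pgdata is a placeholder; substitute your own pool/dataset):

```shell
# Placeholder dataset name; replace tank/pgdata with your own dataset.
zfs set recordsize=128k tank/pgdata
zfs set compression=lz4 tank/pgdata
zfs set atime=off tank/pgdata
zfs set xattr=sa tank/pgdata
zfs set logbias=latency tank/pgdata
zfs set redundant_metadata=most tank/pgdata

# Verify the resulting properties
zfs get recordsize,compression,atime,xattr,logbias,redundant_metadata tank/pgdata
```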
Set atime to off
Traditional UNIX filesystems keep an atime field recording when each file was last accessed. ZFS also supports this feature, and it is on by default. However, atime interacts very badly with ZFS's copy-on-write design and can cause performance degradation: every read triggers an atime update, which is a write, and since every write in ZFS first creates a copy via copy-on-write, working disk space is consumed in proportion to the number of reads. Because it hurts both performance and resource usage, turn it off.
Note that if you install FreeBSD with a ZFS root from the installer, the installer sets atime to off on zroot. You therefore only need to set it yourself on pools you add later.
https://freebsd.seirios.org/doku.php?id=os:zfs_tips#postgresql
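Checking and changing the property on a later-added pool might look like this (pool names are examples; zroot is what the FreeBSD installer creates):

```shell
# Inspect the current atime setting on both pools
zfs get atime zroot tank

# Disable atime on the pool you added yourself
zfs set atime=off tank
```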
postgres conf
https://bun.uptrace.dev/postgres/tuning-zfs-aws-ebs.html
You can check the parameters currently set in PostgreSQL with the SHOW ALL command.
https://changineer.info/server/postgresql/postgresql_modify_parameter.html
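For example, to inspect a single parameter rather than reading the full SHOW ALL output (connection options for psql omitted):

```shell
# Filter the full listing for one parameter
psql -c "SHOW ALL" | grep wal_compression

# Or query the single setting directly
psql -c "SHOW wal_compression"

# Or use the pg_settings view, which also shows the source of the value
psql -c "SELECT name, setting, source FROM pg_settings WHERE name = 'wal_compression'"
```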
full_page_writes = off
Please do not copy these settings blindly because I myself am not clear on why/how these settings had the impact they did. For example, I cannot explain why full_page_writes=off independently did not give that much boost, nor did an optimized PG configuration. However, putting both of them together gave a 2-4x boost compared to baseline numbers.
https://gist.github.com/saurabhnanda/5258207935bf23cd112be292d22f00d5#big-fat-warning
psql -c "SHOW ALL" | grep full_page_writes
 full_page_writes | on | Writes full pages to WAL when first modified after a checkpoint.

Change full_page_writes = off in postgresql.conf and run pg_ctl reload, then:

psql -c "SHOW ALL" | grep full_page_writes
 full_page_writes | off | Writes full pages to WAL when first modified after a checkpoint.
wal_compression = off
Sean advocates small block sizes and turning off all error checking in the database, but I don't expect you to re-init your database to make the latter happen. For a typical database workload with lots of small writes, having "recordsize=16K" for the zfs volume in question might be quite helpful. For my application, I have found that it doesn't help much, nor does it hurt other than it hinders compression some.
The big helper you can do now is that you can safely set "full_page_writes = off" & "wal_compression = off" for the WAL (usually about a 30-50% bump in write speed) as ZFS's copy-on-write scheme makes corruption impossible, and lz4 is better than what Postgres uses internally.
The other big helper is having a nice, fast ZIL SLOG.
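Putting the two WAL-related changes and an SLOG together might look like the following sketch (pool name and device path are placeholders; use a power-loss-protected device for the log):

```shell
# postgresql.conf (then run: pg_ctl reload)
#   full_page_writes = off
#   wal_compression = off

# Attach a fast device as a separate ZFS intent log (SLOG);
# "tank" and "/dev/nvme0n1" are placeholders for your pool and device.
zpool add tank log /dev/nvme0n1

# Confirm the log vdev appears in the pool layout
zpool status tank
```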
SSD effective_io_concurrency
If you are using solid-state drives, consider tweaking the following settings:
# Cost of a randomly fetched disk page.
# SSDs have low random read cost relative to sequential reads.
random_page_cost = 1.1

# Number of simultaneous requests that can be handled efficiently by the disk subsystem.
# SSDs can handle more concurrent requests.
effective_io_concurrency = 200

https://bun.uptrace.dev/postgres/performance-tuning.html#ssd
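Instead of editing postgresql.conf by hand, these two parameters can also be set with ALTER SYSTEM; both take effect on a configuration reload, no restart needed (a sketch, psql connection options omitted):

```shell
psql -c "ALTER SYSTEM SET random_page_cost = 1.1"
psql -c "ALTER SYSTEM SET effective_io_concurrency = 200"

# Apply the changes without restarting the server
psql -c "SELECT pg_reload_conf()"
```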
shared_buffers and ZFS ARC
Memory allocation
Results
Letting ZFS (the ARC) do the caching turned out to perform better.
日記: PostgreSQLのshared_buffers設定とZFSのARCの関係を調べた 2019 ("Diary: investigating the relationship between PostgreSQL's shared_buffers setting and ZFS's ARC", 2019)
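One common way to act on that result is to keep shared_buffers modest and leave the ARC room to cache; a sketch with illustrative sizes (the values below are not from the article; zfs_arc_max is the OpenZFS-on-Linux module parameter, FreeBSD uses vfs.zfs.arc_max):

```shell
# postgresql.conf: keep PostgreSQL's own buffer cache small (illustrative value)
#   shared_buffers = 2GB

# Linux/OpenZFS: cap the ARC, e.g. at 8 GiB (illustrative), in /etc/modprobe.d/zfs.conf
#   options zfs zfs_arc_max=8589934592

# FreeBSD equivalent, in /boot/loader.conf
#   vfs.zfs.arc_max="8589934592"
```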
zfs postgres pgroonga
Memory