PostgreSQL 9.5 Streaming Replication Notes

I have been working with PostgreSQL more often lately and am still learning it.

PostgreSQL has had built-in asynchronous replication since version 9.0. It later gained synchronous replication and then cascading replication, and version 9.6 reportedly made replication with multiple synchronous standbys possible [1].

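I would like to try the multiple synchronous standby feature in the latest version, 9.6, but first, here are some notes on streaming replication up to 9.5, partly as a memo to myself.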

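Replication configurations possible in PostgreSQL 9.5
1. Only one synchronous standby server
  1. When the synchronous standby goes down
2. Cascading connections
Promotion to primary
Detaching a synchronous standby
References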

Replication configurations possible in PostgreSQL 9.5

Only one synchronous standby server

In version 9.5, only one server can act as the synchronous standby [2]. The synchronous_standby_names setting specifies which standby servers may run in synchronous mode. If you list several nodes separated by commas, the first listed server that is currently connected becomes the synchronous standby, and the remaining servers become synchronous candidates.

### primary postgresql.conf ###
max_wal_senders = 3
synchronous_standby_names = 'standby1,standby2'
wal_level = hot_standby
### standby1 postgresql.conf ###
hot_standby = on
### standby1 recovery.conf ###
recovery_target_timeline=latest
primary_conninfo='host=localhost port=5432 application_name=standby1'
standby_mode=on
### standby2 postgresql.conf ###
hot_standby = on
### standby2 recovery.conf ###
recovery_target_timeline=latest
primary_conninfo='host=localhost port=5432 application_name=standby2'
standby_mode=on

The above is a minimal configuration example that assumes several postgres instances running on the local machine on different ports.

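With these settings, standby1 becomes the synchronous standby. You can check the replication status through the pg_stat_replication view. For a listed server that is not the synchronous standby, sync_state is potential (a synchronous candidate).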

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 69010
usesysid         | 10
usename          | postgres
application_name | standby1
client_addr      | ::1
client_hostname  | 
client_port      | 51974
backend_start    | 2017-02-03 11:32:28.499849+00
backend_xmin     | 
state            | streaming
sent_location    | 0/5000000
write_location   | 0/5000000
flush_location   | 0/5000000
replay_location  | 0/4000108
sync_priority    | 1
sync_state       | sync
-[ RECORD 2 ]----+------------------------------
pid              | 69149
usesysid         | 10
usename          | postgres
application_name | standby2
client_addr      | ::1
client_hostname  | 
client_port      | 51976
backend_start    | 2017-02-03 11:33:30.376528+00
backend_xmin     | 
state            | streaming
sent_location    | 0/5000000
write_location   | 0/5000000
flush_location   | 0/5000000
replay_location  | 0/5000000
sync_priority    | 2
sync_state       | potential
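To make the selection rule easier to reason about, here is a small Python sketch that models it. This is my own simplified model, not PostgreSQL code: the function name assign_sync_states is made up, and it ignores details such as the '*' wildcard and case-insensitive name matching.

```python
def assign_sync_states(synchronous_standby_names, connected):
    """Return {application_name: (sync_priority, sync_state)} for connected standbys.

    Simplified model of the 9.5 rule: priority is the 1-based position in
    synchronous_standby_names (0 if not listed), and only the connected
    standby with the lowest non-zero priority is "sync".
    """
    names = [n.strip() for n in synchronous_standby_names.split(",") if n.strip()]
    priority = {name: i + 1 for i, name in enumerate(names)}

    # Among connected standbys that are listed, the lowest priority number wins
    listed = [app for app in connected if app in priority]
    sync = min(listed, key=lambda app: priority[app]) if listed else None

    states = {}
    for app in connected:
        prio = priority.get(app, 0)
        if prio == 0:
            state = "async"
        elif app == sync:
            state = "sync"
        else:
            state = "potential"
        states[app] = (prio, state)
    return states


both = assign_sync_states("standby1,standby2", ["standby1", "standby2"])
print(both)

# Only standby2 is still connected, so it takes over as the sync standby
after_stop = assign_sync_states("standby1,standby2", ["standby2"])
print(after_stop)
```

The first call reproduces the sync_priority and sync_state columns shown in the output above. The second call shows what the model predicts when only standby2 remains connected.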

When the synchronous standby goes down

If the synchronous standby goes down while a potential standby exists, the potential server takes over as the synchronous standby. As a test, let's run stop -m immediate on standby1 in the configuration above.

pg_ctl stop -m immediate

After this, pg_stat_replication looks as follows. You can see that sync_state for standby2 is now sync.

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 69149
usesysid         | 10
usename          | postgres
application_name | standby2
client_addr      | ::1
client_hostname  | 
client_port      | 51976
backend_start    | 2017-02-03 11:33:30.376528+00
backend_xmin     | 
state            | streaming
sent_location    | 0/5000060
write_location   | 0/5000060
flush_location   | 0/5000060
replay_location  | 0/5000060
sync_priority    | 2
sync_state       | sync

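Next, when the stopped standby (standby1) rejoins the cluster with its settings unchanged, it becomes the synchronous standby again because it has the higher priority, and the server that had been the synchronous standby before it rejoined (standby2) goes back to potential.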

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 69400
usesysid         | 10
usename          | postgres
application_name | standby1
client_addr      | ::1
client_hostname  | 
client_port      | 51980
backend_start    | 2017-02-03 11:36:37.069404+00
backend_xmin     | 
state            | streaming
sent_location    | 0/5000060
write_location   | 0/5000060
flush_location   | 0/5000060
replay_location  | 0/5000060
sync_priority    | 1
sync_state       | sync
-[ RECORD 2 ]----+------------------------------
pid              | 69149
usesysid         | 10
usename          | postgres
application_name | standby2
client_addr      | ::1
client_hostname  | 
client_port      | 51976
backend_start    | 2017-02-03 11:33:30.376528+00
backend_xmin     | 
state            | streaming
sent_location    | 0/5000060
write_location   | 0/5000060
flush_location   | 0/5000060
replay_location  | 0/5000060
sync_priority    | 2
sync_state       | potential
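If you want to check this kind of output from a script rather than by eye, the expanded display format is easy to parse. The following is a minimal sketch; the function name parse_expanded is my own, and the sample text is a shortened copy of the output above with only three columns kept.

```python
def parse_expanded(output):
    """Parse psql expanded-display text into a list of {column: value} dicts."""
    records = []
    for line in output.splitlines():
        if line.startswith("-[ RECORD"):
            # A record header starts a new row
            records.append({})
        elif "|" in line and records:
            column, _, value = line.partition("|")
            records[-1][column.strip()] = value.strip()
    return records


sample = """\
-[ RECORD 1 ]----+------------------------------
application_name | standby1
sync_priority    | 1
sync_state       | sync
-[ RECORD 2 ]----+------------------------------
application_name | standby2
sync_priority    | 2
sync_state       | potential
"""

records = parse_expanded(sample)
sync_names = [r["application_name"] for r in records if r["sync_state"] == "sync"]
print(sync_names)
```

All values come back as strings, so sync_priority would need an int() conversion before any numeric comparison.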

Cascading connections

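PostgreSQL replication also supports cascading, where a standby is attached to another standby. The configuration is very simple: just point primary_conninfo in recovery.conf at the upstream standby server. (You still need to allow the connection in pg_hba.conf and configure settings such as max_wal_senders.)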

### standby2 postgresql.conf ###
max_wal_senders = 3
wal_level = hot_standby
hot_standby = on
### standby3 postgresql.conf ###
hot_standby = on
### standby3 recovery.conf ###
recovery_target_timeline=latest
# standby2 is assumed to be running on port 5433
primary_conninfo='host=localhost port=5433 application_name=standby3'
standby_mode=on

A standby attached by cascading is replicated asynchronously (listing standby3 in synchronous_standby_names on standby2 does not make it synchronous).
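To visualize how the primary_conninfo settings form a chain, here is a rough Python sketch that follows the ports from a cascaded standby up to the primary. The helper names and the servers dictionary are hypothetical, and port 5434 for standby3 is a value I made up for illustration. Only 5432 and 5433 come from the configuration above.

```python
def parse_conninfo(conninfo):
    """Split a simple 'key=value key=value' conninfo string into a dict.

    Quoted values containing spaces are not handled in this sketch.
    """
    return dict(item.split("=", 1) for item in conninfo.split())


def upstream_chain(name, servers):
    """Follow primary_conninfo ports from a standby up to the primary."""
    by_port = {info["port"]: server for server, info in servers.items()}
    chain = [name]
    while servers[chain[-1]]["primary_conninfo"] is not None:
        upstream_port = parse_conninfo(servers[chain[-1]]["primary_conninfo"])["port"]
        chain.append(by_port[upstream_port])
    return chain


servers = {
    "primary": {"port": "5432", "primary_conninfo": None},
    "standby2": {
        "port": "5433",
        "primary_conninfo": "host=localhost port=5432 application_name=standby2",
    },
    # Port 5434 is an assumed value, not taken from the original setup
    "standby3": {
        "port": "5434",
        "primary_conninfo": "host=localhost port=5433 application_name=standby3",
    },
}

chain = upstream_chain("standby3", servers)
print(chain)
```

The sketch assumes every server runs on localhost and that the settings contain no loop, which holds for the setup in these notes.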

Promotion to primary

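If something goes wrong with the primary server and you want a standby to take over as the new primary, run pg_ctl promote on the standby (the application side needs some kind of switchover handling). The standby is then promoted to primary, and updates can be executed on the new primary. PostgreSQL itself has no automatic failover feature, so if you want the switch from standby to primary to happen automatically, you need to combine it with another tool.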
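If the promotion is triggered from a script, the call might look like the sketch below. It only wraps the pg_ctl promote command with its -D option. The function names are my own, db2 is just an example data directory path, and it assumes pg_ctl is on the PATH.

```python
import subprocess


def pg_ctl_promote_command(data_dir):
    """Build the argument list for promoting the standby in data_dir."""
    return ["pg_ctl", "promote", "-D", data_dir]


def promote(data_dir):
    """Run pg_ctl promote; raises CalledProcessError on a non-zero exit."""
    subprocess.run(pg_ctl_promote_command(data_dir), check=True)


# Only print the command here; promote("db2") would actually run it
print(pg_ctl_promote_command("db2"))
```

Deciding when to call promote() is the hard part, and that is exactly what the external failover tools mentioned above take care of.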

Running pg_ctl promote also produces a new timeline ID. PostgreSQL uses the concept of a "timeline" [3] to distinguish the state of the database system before and after a recovery. In the promotion above, if the timeline ID was 1 before running pg_ctl promote, it is incremented to 2 afterwards.

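Below is the pg_xlog directory of the server on which pg_ctl promote was run. The leading part of each WAL file name, "00000001", indicates the timeline ID, and you can see that it changes from "00000001" to "00000002" across the promotion.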

% ll db2/pg_xlog
total 65540
-rw------- 1 postgres 16777216  2  5 19:04 000000010000000000000003
-rw------- 1 postgres 16777216  2  5 19:04 000000010000000000000004
-rw------- 1 postgres       41  2  5 19:08 00000002.history
-rw------- 1 postgres 16777216  2  5 19:08 000000020000000000000004
-rw------- 1 postgres 16777216  2  5 19:04 000000020000000000000005
drwx------ 3 postgres      102  2  5 19:08 archive_status
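The timeline ID can also be read out of the file names programmatically. Here is a minimal sketch, assuming the usual 24-hex-digit segment file name made of three 8-digit fields with the timeline ID first; the function name parse_wal_filename is my own.

```python
def parse_wal_filename(name):
    """Split a 24-hex-digit WAL segment file name into its three 8-digit fields."""
    if len(name) != 24:
        raise ValueError("not a WAL segment file name: %r" % name)
    # Fields are: timeline ID, then the two parts of the segment position
    timeline, log, segment = (int(name[i:i + 8], 16) for i in (0, 8, 16))
    return timeline, log, segment


files = [
    "000000010000000000000003",
    "000000010000000000000004",
    "000000020000000000000004",
    "000000020000000000000005",
]
timelines = sorted({parse_wal_filename(f)[0] for f in files})
print(timelines)
```

Files such as 00000002.history are not segment files, so this sketch rejects them with a ValueError.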

Detaching a synchronous standby

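When replication consists of a primary and a synchronous standby and the synchronous standby stops because of some failure, updates on the primary can no longer proceed, because commits wait for the synchronous standby. In that case you need to edit the primary's configuration file and detach the failed synchronous standby.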

#synchronous_standby_names='standby3'

Comment out the synchronous setting as shown above and run pg_ctl reload. The primary can then continue processing updates.
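The edit itself could be scripted along these lines. This is a rough sketch that works on the configuration text in memory; the function name comment_out_setting is my own, and it only handles lines written in the name = value form.

```python
import re


def comment_out_setting(conf_text, name):
    """Return conf_text with every active 'name = value' line commented out.

    Lines that are already commented out are left alone. Settings written
    without '=' are not handled in this sketch.
    """
    pattern = re.compile(r"^([ \t]*)(" + re.escape(name) + r"\s*=)", re.MULTILINE)
    return pattern.sub(r"\1#\2", conf_text)


conf = (
    "max_wal_senders = 3\n"
    "synchronous_standby_names='standby3'\n"
    "wal_level = hot_standby\n"
)
new_conf = comment_out_setting(conf, "synchronous_standby_names")
print(new_conf)
```

After writing the result back to postgresql.conf, pg_ctl reload is still needed for the primary to pick up the change.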

The following tool, which I wrote myself, may be useful for checking how replication behaves:

GitHub - moritetu/pgenv2: pgenv2 is a tool to help you to manage multiple PostgreSQL versions.

References

PostgreSQL全機能バイブル

Footnotes
[1] Up to 9.5, a primary server could have only one synchronous standby; from 9.6, synchronous replication with multiple nodes became possible.
[2] http://www.postgresql.jp/document/9.5/html/runtime-config-replication.html#guc-synchronous-standby-names
[3] https://www.postgresql.jp/document/9.5/html/continuous-archiving.html