Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --exclude-snapshots option, which makes exclude pg_logical/snapshots #272

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

harukat
Copy link

@harukat harukat commented Dec 18, 2024

If there are many large file size temporary files or logical replication snapshot files in the database cluster directory, it will take longer to get a full backup. These files are not likely to be used after the restore. Therefore, I have added an option to exclude these files when obtaining full backups.

Since pg_basebackup always excludes pgsql_tmp, it is possible to always exclude pgsql_tmp and make only snapshots optional. Please let us know what you think.

…ots dirs.

If there are many large file size temporary files or logical replication
snapshot files in the database cluster directory, it will take longer to
get a full backup.  These files are not likely to be used after the restore.
Therefore, I have added an option to exclude these files when obtaining
full backups.
@harukat
Copy link
Author

harukat commented Dec 20, 2024

It was pointed out to me that pgsql_tmp is already excluded in the current code, so I'll fixed it.

In the most recent commit, I added a --exclude-tmpdir option that also
excludes snapshots and pgsql_tmp directory, but since pgsql_tmp was
already always excluded, I limited it to only excluding snapshots.
@harukat harukat changed the title Add --exclude-tmpdir option, which makes exclude pgsql_tmp and snapshots dirs Add --exclude-snapshots option, which makes exclude pg_logical/snapshots dirs Dec 20, 2024
@harukat harukat changed the title Add --exclude-snapshots option, which makes exclude pg_logical/snapshots dirs Add --exclude-snapshots option, which makes exclude pg_logical/snapshots Dec 20, 2024
@shinyaaa shinyaaa self-requested a review January 24, 2025 08:23
Copy link
Contributor

@shinyaaa shinyaaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I have some comments.

I'd like you to add option test, like below.

pg_rman/sql/option.sh

Lines 179 to 184 in d1b5c57

echo '###### COMMAND OPTION TEST-0019 ######'
echo '###### invalid value in pg_rman.ini ######'
init_catalog
echo "HARD_COPY=FOO" >> ${BACKUP_PATH}/pg_rman.ini
pg_rman backup -B ${BACKUP_PATH} -A ${ARCLOG_PATH} -b full -p ${TEST_PGPORT};echo $?
echo ''

And env is need to be initialized.

pg_rman/sql/common.sh

Lines 9 to 30 in d1b5c57

# Unset environment variables usable by both Postgres and pg_rman
unset PGUSER
unset PGPORT
unset PGDATABASE
unset COMPRESS_DATA
unset BACKUP_MODE
unset ARCLOG_PATH
unset BACKUP_PATH
unset SRVLOG_PATH
unset WITH_SERVLOG
unset SMOOTH_CHECKPOINT
unset KEEP_DATA_GENERATIONS
unset KEEP_DATA_DAYS
unset KEEP_ARCLOG_FILES
unset KEEP_ARCLOG_DAYS
unset KEEP_SRVLOG_FILES
unset KEEP_SRVLOG_DAYS
unset RECOVERY_TARGET_TIME
unset RECOVERY_TARGET_XID
unset RECOVERY_TARGET_INCLUSIVE
unset RECOVERY_TARGET_TIMELINE
unset HARD_COPY

Comment on lines +215 to +218
if (exclude_snapshots)
{
pgdata_exclude[i++] = SNAPSHOTS_DIR;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (exclude_snapshots)
{
pgdata_exclude[i++] = SNAPSHOTS_DIR;
}
if (exclude_snapshots)
pgdata_exclude[i++] = SNAPSHOTS_DIR;

<li><strong><code>--exclude-snapshots</code></strong>

<ul>
<li>一時ファイル用に使われる <code>pgsql_tmp</code> ディレクトリ内と、ロジカルレプリケーションで使われる <code>snapshots</code> ディレクトリ内をバックアップ対象から除外します。</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<li>一時ファイル用に使われる <code>pgsql_tmp</code> ディレクトリ内と、ロジカルレプリケーションで使われる <code>snapshots</code> ディレクトリ内をバックアップ対象から除外します。</li>
<li>ロジカルレプリケーションで使われる <code>$PGDATA/pg_logical/snapshots</code> ディレクトリ内をバックアップ対象から除外します。</li>

<td>&ndash;exclude-snapshots</td>
<td>EXCLUDE_TMPDIR</td>
<td>指定可</td>
<td>ロジカルスナップショットをバックアップから除外</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure "ロジカルスナップショット" is general representation.

Suggested change
<td>ロジカルスナップショットをバックアップから除外</td>
<td>ロジカルレプリケーションのスナップショットをバックアップから除外</td>

<li><strong><code>--exclude-snapshots</code></strong>

<ul>
<li>Exclude <code>snapshots</code> directory, which is used for logical replication.</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<li>Exclude <code>snapshots</code> directory, which is used for logical replication.</li>
<li>Exclude <code>$PGDATA/pg_logical/snapshots</code> directory which is used for logical replication.</li>

<tr>
<td></td>
<td>&ndash;exclude-snapshots</td>
<td>EXCLUDE_TMPDIR</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This env name is confuse us.

Suggested change
<td>EXCLUDE_TMPDIR</td>
<td>EXCLUDE_SNAPSHOT_DIR</td>

@@ -40,6 +40,7 @@ Backup options:
--keep-srvlog-days=DAY keep serverlog modified in DAY days
--standby-host=HOSTNAME standby host when taking backup from standby
--standby-port=PORT standby port when taking backup from standby
--exclude-snapshots exclude pg_logical/snapshots
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--exclude-snapshots exclude pg_logical/snapshots
--exclude-snapshots exclude $PGDATA/pg_logical/snapshots

@@ -17,6 +17,7 @@
const char *PROGRAM_VERSION = "1.3.16";
const char *PROGRAM_URL = "http://github.com/ossc-db/pg_rman";
const char *PROGRAM_ISSUES = "http://github.com/ossc-db/pg_rman/issues";
const char *SNAPSHOTS_DIR = "snapshots";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be below.

pg_rman/pg_rman.h

Lines 27 to 45 in d1b5c57

/* Directory/File names */
#define DATABASE_DIR "database"
#define ARCLOG_DIR "arclog"
#define SRVLOG_DIR "srvlog"
#define RESTORE_WORK_DIR "backup"
#define PG_XLOG_DIR "pg_wal"
#define PG_TBLSPC_DIR "pg_tblspc"
#define TIMELINE_HISTORY_DIR "timeline_history"
#define BACKUP_INI_FILE "backup.ini"
#define PG_RMAN_INI_FILE "pg_rman.ini"
#define SYSTEM_IDENTIFIER_FILE "system_identifier"
#define MKDIRS_SH_FILE "mkdirs.sh"
#define DATABASE_FILE_LIST "file_database.txt"
#define ARCLOG_FILE_LIST "file_arclog.txt"
#define SRVLOG_FILE_LIST "file_srvlog.txt"
#define SNAPSHOT_SCRIPT_FILE "snapshot_script"
#define PG_BACKUP_LABEL_FILE "backup_label"
#define PG_TBLSPC_MAP_FILE "tablespace_map"
#define PG_BLACK_LIST "black_list"

@@ -32,6 +33,7 @@ bool check = false;

/* directory configuration */
pgBackup current;
static bool exclude_snapshots = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is backup configuration. For example, directory options are -D, -A, -B, -S, and -G.

@@ -97,6 +99,7 @@ static pgut_option options[] =
{ 's', 10, "recovery-target-timeline" , &target_tli_string, SOURCE_ENV },
{ 's', 11, "recovery-target-action" , &target_action , SOURCE_ENV },
{ 'b', 12, "hard-copy" , &is_hard_copy , SOURCE_ENV },
{ 'b', 15, "exclude-snapshots", &exclude_snapshots, SOURCE_ENV },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is backup configuration.

@@ -289,6 +296,7 @@ pgut_help(bool details)
printf(_(" --keep-srvlog-days=DAY keep serverlog modified in DAY days\n"));
printf(_(" --standby-host=HOSTNAME standby host when taking backup from standby\n"));
printf(_(" --standby-port=PORT standby port when taking backup from standby\n"));
printf(_(" --exclude-snapshots exclude pg_logical/snapshots\n"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
printf(_(" --exclude-snapshots exclude pg_logical/snapshots\n"));
printf(_(" --exclude-snapshots exclude $PGDATA/pg_logical/snapshots\n"));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants