pt-heartbeat¶
NAME¶
pt-heartbeat - Monitor MySQL replication delay.
SYNOPSIS¶
Usage¶
pt-heartbeat [OPTIONS] [DSN] --update|--monitor|--check|--stop
pt-heartbeat measures replication lag on a MySQL or PostgreSQL server. You can use it to update a replication source or monitor a replica. If possible, MySQL connection options are read from your .my.cnf file.
Start daemonized process to update test.heartbeat table on replication source:
pt-heartbeat -D test --update -h source-server --daemonize
Monitor replication lag on replica:
pt-heartbeat -D test --monitor -h replica-server
pt-heartbeat -D test --monitor -h replica-server --dbi-driver Pg
Check replica lag once and exit (using optional DSN to specify replica host):
pt-heartbeat -D test --check h=replica-server
RISKS¶
Percona Toolkit is mature, proven in the real world, and well tested, but all database tools can pose a risk to the system and the database server. Before using this tool, please:
Read the tool’s documentation
Review the tool’s known “BUGS”
Test the tool on a non-production server
Backup your production server and verify the backups
DESCRIPTION¶
pt-heartbeat is a two-part MySQL and PostgreSQL replication delay monitoring system that measures delay by looking at actual replicated data. This avoids reliance on the replication mechanism itself, which is unreliable (for example, SHOW REPLICA STATUS on MySQL).
The first part is an --update instance of pt-heartbeat that connects to a replication source and updates a timestamp (“heartbeat record”) every --interval seconds. Since the heartbeat table may contain records from multiple replication sources (see “MULTI-REPLICA HIERARCHY”), the server’s ID (@@server_id) is used to identify records.
The second part is a --monitor or --check instance of pt-heartbeat that connects to a replica, examines the replicated heartbeat record from its immediate source or the specified --source-server-id, and computes the difference from the current system time. If replication between the replica and the source is delayed or broken, the computed difference will be greater than zero and potentially increase if --monitor is specified.
You must either manually create the heartbeat table on the replication source or use --create-table. See --create-table for the proper heartbeat table structure. For MySQL, the MEMORY storage engine is suggested but not required.
The heartbeat table must contain a heartbeat row. By default, a heartbeat row is inserted if it doesn’t exist. This feature can be disabled with the --[no]insert-heartbeat-row option in case the database user does not have INSERT privileges.
pt-heartbeat depends only on the heartbeat record being replicated to the replica, so it works regardless of the replication mechanism (built-in replication, a system such as Continuent Tungsten, etc). It works at any depth in the replication hierarchy; for example, it will reliably report how far a replica lags its source’s source’s source. And if replication is stopped, it will continue to work and report (accurately!) that the replica is falling further and further behind the source.
pt-heartbeat has a maximum resolution of 0.01 second. The clocks on the source and replica servers must be closely synchronized via NTP. By default, --update checks happen on the edge of the second (e.g. 00:01) and --monitor checks happen halfway between seconds (e.g. 00:01.5).
As long as the servers’ clocks are closely synchronized and replication
events are propagating in less than half a second, pt-heartbeat will report
zero seconds of delay.
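The effect of checking halfway between seconds can be illustrated with a small model. The following Python sketch is not the tool's code: it assumes whole-second heartbeat timestamps written exactly on each second and a fixed replication latency, and the function name `reported_delay` is invented for this example.

```python
import math

def reported_delay(check_time: float, replication_latency: float) -> int:
    """Model the delay a monitor would see with whole-second heartbeats.

    Assumption: the source writes a heartbeat at t = 0, 1, 2, ... seconds
    and each one reaches the replica replication_latency seconds later.
    """
    # Newest heartbeat that has already arrived on the replica at check_time.
    newest_visible = math.floor(check_time - replication_latency)
    # Compare against the current whole second on the monitoring side.
    return math.floor(check_time) - newest_visible

# The monitor checks halfway between seconds (00:01.5).
fast = reported_delay(1.5, 0.3)  # replication faster than half a second
slow = reported_delay(1.5, 0.7)  # replication slower than half a second
print(fast, slow)
```

In this model, a heartbeat that arrives within half a second is already visible at check time, so the difference is zero; a slower one is missed and the previous heartbeat is compared instead.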
pt-heartbeat will try to reconnect if the connection has an error, but will not retry if it can’t get a connection when it first starts.
The --dbi-driver option lets you use pt-heartbeat to monitor PostgreSQL as well. It is reported to work well with Slony-1 replication.
MULTI-REPLICA HIERARCHY¶
If the replication hierarchy has multiple replicas which are sources of other replicas, like “source -> replica1 -> replica2”, --update instances can be run on the replicas as well as the source. The default heartbeat table (see --create-table) is keyed on the server_id column, so each server will update the row where server_id=@@server_id.
For --monitor and --check, if --source-server-id is not specified, the tool tries to discover and use the replica’s immediate source. If this fails, or if you want to monitor lag from another source, then you can specify the --source-server-id to use.
For example, if the replication hierarchy is “source -> replica1 -> replica2” with corresponding server IDs 1, 2 and 3, you can:
pt-heartbeat --daemonize -D test --update -h source
pt-heartbeat --daemonize -D test --update -h replica1
Then check (or monitor) the replication delay from source to replica2:
pt-heartbeat -D test --source-server-id 1 --check replica2
Or check the replication delay from replica1 to replica2:
pt-heartbeat -D test --source-server-id 2 --check replica2
Stopping the --update instance on replica1 will not affect the instance on the source.
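The role of the server ID in this hierarchy can be sketched in Python. This is an illustration only, not the tool's implementation: the dictionary stands in for the heartbeat table as seen on replica2, and the timestamps and the function name `delay_from` are hypothetical.

```python
from datetime import datetime

# Hypothetical snapshot of the heartbeat table as seen on replica2:
# one row per --update instance, keyed by server_id.
heartbeat_rows = {
    1: datetime(2024, 1, 1, 12, 0, 0),  # written on source (server ID 1)
    2: datetime(2024, 1, 1, 12, 0, 3),  # written on replica1 (server ID 2)
}

def delay_from(source_server_id: int, now: datetime) -> float:
    """Seconds between now and the heartbeat written by the given server."""
    return (now - heartbeat_rows[source_server_id]).total_seconds()

now = datetime(2024, 1, 1, 12, 0, 5)
print(delay_from(1, now))  # delay measured against source
print(delay_from(2, now))  # delay measured against replica1
```

Because each --update instance only touches its own row, choosing a different server ID simply selects a different timestamp to compare with the current time.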
The default heartbeat table (see --create-table) has columns for saving information from SHOW BINARY LOG STATUS (SHOW MASTER STATUS before MySQL 8.4) and SHOW REPLICA STATUS. These columns are optional. If any are present, their corresponding information will be saved.
Percona XtraDB Cluster¶
Although pt-heartbeat should work with all supported versions of Percona XtraDB Cluster (PXC), we recommend using 5.5.28-23.7 and newer.
If you are setting up heartbeat instances between cluster nodes, keep in mind that, since the speed of the cluster is determined by its slowest node, pt-heartbeat will not report how fast the cluster itself is, but only how fast events are replicating from one node to another.
You must specify --source-server-id for --monitor and --check instances.
OPTIONS¶
Specify at least one of --stop, --update, --monitor, or --check.
--update, --monitor, and --check are mutually exclusive.
--daemonize and --check are mutually exclusive.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
- --ask-pass¶
Prompt for a password when connecting to MySQL.
- --charset¶
short form: -A; type: string
Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
- --check¶
Check replica delay once and exit. If you also specify --recurse, the tool will try to discover replicas of the given replica and check and print their lag, too. The hostname or IP and port for each replica is printed before its delay. --recurse only works with MySQL.
- --check-read-only¶
Check if the server has read_only enabled; if it does, the tool skips doing any inserts. See also --read-only-interval.
- --config¶
type: Array
Read this comma-separated list of config files; if specified, this must be the first option on the command line.
- --create-table¶
Create the heartbeat --table if it does not exist.
This option causes the table specified by --database and --table to be created with the following table definition:
CREATE TABLE heartbeat (
  ts                    varchar(26) NOT NULL,
  server_id             int unsigned NOT NULL PRIMARY KEY,
  file                  varchar(255) DEFAULT NULL,    -- SHOW BINARY LOG STATUS
  position              bigint unsigned DEFAULT NULL, -- SHOW BINARY LOG STATUS
  relay_source_log_file varchar(255) DEFAULT NULL,    -- SHOW REPLICA STATUS
  exec_source_log_pos   bigint unsigned DEFAULT NULL  -- SHOW REPLICA STATUS
);
The heartbeat table requires at least one row. If you manually create the heartbeat table, then you must insert a row by doing:
INSERT INTO heartbeat (ts, server_id) VALUES (NOW(), N);
or if using --utc:
INSERT INTO heartbeat (ts, server_id) VALUES (UTC_TIMESTAMP(), N);
where N is the server’s ID; do not use @@server_id because it will replicate and replicas will insert their own server ID instead of the source’s server ID.
This is done automatically by --create-table.
A legacy version of the heartbeat table is still supported:
CREATE TABLE heartbeat (
  id int NOT NULL PRIMARY KEY,
  ts datetime NOT NULL
);
Legacy tables do not support --update instances on each replica of a multi-replica hierarchy like “source -> replica1 -> replica2”. To manually insert the one required row into a legacy table:
INSERT INTO heartbeat (id, ts) VALUES (1, NOW());
or if using --utc:
INSERT INTO heartbeat (id, ts) VALUES (1, UTC_TIMESTAMP());
The tool automatically detects if the heartbeat table is legacy.
See also “MULTI-REPLICA HIERARCHY”.
- --create-table-engine¶
type: string
Sets the engine to be used for the heartbeat table. The default storage engine is InnoDB as of MySQL 5.5.5.
- --daemonize¶
Fork to the background and detach from the shell. POSIX operating systems only.
- --database¶
short form: -D; type: string
The database to use for the connection.
- --dbi-driver¶
default: mysql; type: string
Specify a driver for the connection; mysql and Pg are supported.
- --defaults-file¶
short form: -F; type: string
Only read mysql options from the given file. You must give an absolute pathname.
- --file¶
type: string
Print latest --monitor output to this file.
When --monitor is given, prints output to the specified file instead of to STDOUT. The file is opened, truncated, and closed every interval, so it will only contain the most recent statistics. Useful when --daemonize is given.
- --frames¶
type: string; default: 1m,5m,15m
Timeframes for averages.
Specifies the timeframes over which to calculate moving averages when --monitor is given. Specify as a comma-separated list of numbers with suffixes. The suffix can be s for seconds, m for minutes, h for hours, or d for days. The size of the largest frame determines the maximum memory usage, as up to the specified number of per-second samples are kept in memory to calculate the averages. You can specify as many timeframes as you like.
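As an illustration of how such a frame list relates to memory use, here is a minimal Python sketch. It is not the tool's implementation: it assumes one sample per second, and the names `parse_frames` and `MovingAverages` are made up for this example.

```python
from collections import deque
from typing import List

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_frames(spec: str) -> List[int]:
    """Convert a spec such as '1m,5m,15m' into frame lengths in seconds."""
    return [int(item[:-1]) * UNIT_SECONDS[item[-1]] for item in spec.split(",")]

class MovingAverages:
    """Keep only as many per-second samples as the largest frame needs."""

    def __init__(self, spec: str) -> None:
        self.frames = parse_frames(spec)
        self.samples = deque(maxlen=max(self.frames))

    def add(self, delay: float) -> List[float]:
        self.samples.append(delay)
        recent = list(self.samples)
        # Average the newest `frame` samples (or all of them, if fewer exist).
        return [sum(recent[-frame:]) / len(recent[-frame:]) for frame in self.frames]

averages = MovingAverages("2s,4s")
for delay in [0.0, 2.0, 4.0, 6.0]:
    result = averages.add(delay)
print(parse_frames("1m,5m,15m"), result)
```

The bounded deque is what ties memory use to the largest frame: older samples fall off once the longest window is full.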
- --help¶
Show help and exit.
- --host¶
short form: -h; type: string
Connect to host.
- --[no]insert-heartbeat-row¶
default: yes
Insert a heartbeat row in the --table if one doesn’t exist.
The heartbeat --table requires a heartbeat row, else there’s nothing to --update, --monitor, or --check! By default, the tool will insert a heartbeat row if one is not already present. You can disable this feature by specifying --no-insert-heartbeat-row in case the database user does not have INSERT privileges.
- --interval¶
type: float; default: 1.0
How often to update or check the heartbeat --table. Updates and checks begin on the first whole second then repeat every --interval seconds for --update and every --interval plus --skew seconds for --monitor.
For example, if at 00:00.4 an --update instance is started at 0.5 second intervals, the first update happens at 00:01.0, the next at 00:01.5, etc. If at 00:10.7 a --monitor instance is started at 0.05 second intervals with the default 0.5 second --skew, then the first check happens at 00:11.5 (00:11.0 + 0.5) which will be --skew seconds after the last update which, because the instances are checking at synchronized intervals, happened at 00:11.0.
The tool waits for and begins on the first whole second just to make the interval calculations simpler. Therefore, the tool could wait up to 1 second before updating or checking.
The minimum (fastest) interval is 0.01, and the maximum precision is two decimal places, so 0.015 will be rounded to 0.02.
If a legacy heartbeat table (see --create-table) is used, then the maximum precision is 1s because the ts column is type datetime.
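The start-time and rounding rules described for --interval can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the tool's code: times are plain seconds within the minute, rounding is assumed to be half-up on a decimal value, and the function names are hypothetical.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

def round_interval(interval: str) -> Decimal:
    """Round to two decimal places, with halves rounding up."""
    return Decimal(interval).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

def first_run(start: float, skew: float = 0.0) -> float:
    """First whole second after start, plus an optional skew."""
    return math.ceil(start) + skew

print(round_interval("0.015"))    # 0.02
print(first_run(0.4))             # --update started at 00:00.4 -> 1.0
print(first_run(10.7, skew=0.5))  # --monitor started at 00:10.7 -> 11.5
```

The interval is passed as a string so that the decimal value 0.015 is rounded exactly; a binary float of the same value sits slightly below 0.015 and would round down.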
- --log¶
type: string
Print all output to this file when daemonized.
- --master-server-id¶
type: string
This option is deprecated and will be removed in future releases. Use --source-server-id instead.
- --source-server-id¶
type: string
Calculate delay from this source server ID for --monitor or --check. If not given, pt-heartbeat attempts to connect to the server’s source and determine its server ID.
- --monitor¶
Monitor replica delay continuously.
Specifies that pt-heartbeat should check the replica’s delay every second and report to STDOUT (or if --file is given, to the file instead). The output is the current delay followed by moving averages over the timeframe given in --frames. For example,
5s [ 0.25s, 0.05s, 0.02s ]
- --fail-successive-errors¶
type: int
If specified, pt-heartbeat will fail after given number of successive DBI errors (failure to connect to server or issue a query).
- --password¶
short form: -p; type: string
Password to use when connecting. If password contains commas they must be escaped with a backslash: “exam\,ple”
- --pid¶
type: string
Create the given PID file. The tool won’t start if the PID file already exists and the PID it contains is different than the current PID. However, if the PID file exists and the PID it contains is no longer running, the tool will overwrite the PID file with the current PID. The PID file is removed automatically when the tool exits.
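The PID file rules above can be summarized in a short Python sketch. It is an assumption-laden illustration, not how pt-heartbeat is implemented: the function names are invented, and the liveness check is injectable so the logic can be exercised without real processes.

```python
import os
import tempfile
from typing import Callable

def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True

def claim_pid_file(path: str, pid: int,
                   is_running: Callable[[int], bool] = pid_is_running) -> None:
    """Write pid to path, refusing if another live process owns the file."""
    if os.path.exists(path):
        with open(path) as handle:
            existing = int(handle.read().strip())
        if existing != pid and is_running(existing):
            raise RuntimeError(f"PID file {path} is held by running PID {existing}")
    with open(path, "w") as handle:
        handle.write(f"{pid}\n")

with tempfile.TemporaryDirectory() as tmp:
    pid_file = os.path.join(tmp, "pt-heartbeat.pid")
    claim_pid_file(pid_file, 100, is_running=lambda pid: False)
    # PID 100 is treated as no longer running, so the stale file is overwritten.
    claim_pid_file(pid_file, 200, is_running=lambda pid: False)
    with open(pid_file) as handle:
        stored = handle.read().strip()
    try:
        # PID 200 is treated as still running, so a third claim is refused.
        claim_pid_file(pid_file, 300, is_running=lambda pid: True)
        refused = False
    except RuntimeError:
        refused = True
print(stored, refused)
```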
- --port¶
short form: -P; type: int
Port number to use for connection.
- --print-master-server-id¶
This option is deprecated and will be removed in future releases. Use --print-source-server-id instead.
- --print-source-server-id¶
Print the auto-detected or given --source-server-id. If --check or --monitor is specified, specifying this option will print the auto-detected or given --source-server-id at the end of each line.
- --read-only-interval¶
type: int
When --check-read-only is specified, the interval to sleep while the server is found to be read-only. If unspecified, --interval is used.
- --recurse¶
type: int
Check replicas recursively to this depth in --check mode.
Try to discover replica servers recursively, to the specified depth. After discovering servers, run the check on each one of them and print the hostname (if possible), followed by the replica delay.
This currently works only with MySQL. See --recursion-method.
- --recursion-method¶
type: array; default: processlist,hosts
Preferred recursion method used to find replicas.
Possible methods are:
METHOD       USES
===========  ====================
processlist  SHOW PROCESSLIST
hosts        SHOW REPLICA HOSTS
none         Do not find replicas
The processlist method is preferred because SHOW REPLICA HOSTS is not reliable. However, the hosts method is required if the server uses a non-standard port (not 3306). Usually pt-heartbeat does the right thing and finds the replicas, but you may give a preferred method and it will be used first. If it doesn’t find any replicas, the other methods will be tried.
- --replace¶
Use REPLACE instead of UPDATE for --update.
When running in --update mode, use REPLACE instead of UPDATE to set the heartbeat table’s timestamp. The REPLACE statement is a MySQL extension to SQL. This option is useful when you don’t know whether the table contains any rows or not. It must be used in conjunction with --update.
- --run-time¶
type: time
Time to run before exiting.
- --sentinel¶
type: string; default: /tmp/pt-heartbeat-sentinel
Exit if this file exists.
- --slave-user¶
type: string
This option is deprecated and will be removed in future releases. Use --replica-user instead.
- --slave-password¶
type: string
This option is deprecated and will be removed in future releases. Use --replica-password instead.
- --replica-user¶
type: string
Sets the user to be used to connect to the replicas. This parameter allows you to have a different user with fewer privileges on the replicas, but that user must exist on all replicas.
- --replica-password¶
type: string
Sets the password to be used to connect to the replicas. It can be used with --replica-user, and the password for the user must be the same on all replicas.
- --set-vars¶
type: Array
Set the MySQL variables in this comma-separated list of variable=value pairs.
By default, the tool sets:
wait_timeout=10000
Variables specified on the command line override these defaults. For example, specifying --set-vars wait_timeout=500 overrides the default value of 10000.
The tool prints a warning and continues if a variable cannot be set.
- --skew¶
type: float; default: 0.5
How long to delay checks.
The default is to delay checks one half second. Since the update happens as soon as possible after the beginning of the second on the replication source, this allows one half second of replication delay before reporting that the replica lags the source by one second. If your clocks are not completely accurate or there is some other reason you’d like to delay the replica more or less, you can tweak this value. Try setting the PTDEBUG environment variable to see the effect this has.
- --socket¶
short form: -S; type: string
Socket file to use for connection.
- --stop¶
Stop running instances by creating the sentinel file.
This should have the effect of stopping all running instances which are watching the same sentinel file. If none of --update, --monitor or --check is specified, pt-heartbeat will exit after creating the file. If one of these is specified, pt-heartbeat will wait the interval given by --interval, then remove the file and continue working.
You might find this handy to stop cron jobs gracefully if necessary, or to replace one running instance with another. For example, if you want to stop and restart pt-heartbeat every hour (just to make sure that it is restarted every hour, in case of a server crash or some other problem), you could use a crontab line like this:
0 * * * * pt-heartbeat --update -D test --stop \
  --sentinel /tmp/pt-heartbeat-hourly
The non-default --sentinel will make sure the hourly cron job stops only instances previously started with the same options (that is, from the same cron job).
See also --sentinel.
- --table¶
type: string; default: heartbeat
The table to use for the heartbeat.
Don’t specify database.table; use --database to specify the database.
See --create-table.
- --update¶
Update a replication source’s heartbeat.
- --user¶
short form: -u; type: string
User for login if not current user.
- --utc¶
Ignore system time zones and use only UTC. By default pt-heartbeat does not check or adjust for different system or MySQL time zones which can cause the tool to compute the lag incorrectly. Specifying this option is a good idea because it ensures that the tool works correctly regardless of time zones.
If used, this option must be used for all pt-heartbeat instances: --update, --monitor, --check, etc. You should probably set the option in a --config file. Mixing this option with pt-heartbeat instances not using this option will cause false-positive lag readings due to different time zones (unless all your systems are set to use UTC, in which case this option isn’t required).
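The time zone problem can be demonstrated with plain datetime arithmetic. The following Python sketch uses a hypothetical setup (a source in UTC-5 and a monitoring host in UTC+2) and is only a model of the effect, not of the tool's internals.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical setup: the source runs in UTC-5, the monitoring host in UTC+2.
source_zone = timezone(timedelta(hours=-5))
monitor_zone = timezone(timedelta(hours=2))

# One single instant, observed on both hosts at the same moment.
instant = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# With local wall-clock times, the stored timestamp carries no zone
# information, so the two readings of the same instant disagree.
written = instant.astimezone(source_zone).replace(tzinfo=None)
read_at = instant.astimezone(monitor_zone).replace(tzinfo=None)
false_lag = (read_at - written).total_seconds()

# With UTC on both sides, the same instant yields the same wall-clock value.
written_utc = instant.replace(tzinfo=None)
read_at_utc = instant.replace(tzinfo=None)
true_lag = (read_at_utc - written_utc).total_seconds()

print(false_lag, true_lag)
```

Here the seven-hour zone difference shows up as 25200 seconds of apparent lag even though no time has passed, while the UTC comparison reports zero.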
- --version¶
Show version and exit.
- --[no]version-check¶
default: yes
Check for the latest version of Percona Toolkit, MySQL, and other programs.
This is a standard “check for updates automatically” feature, with two additional features. First, the tool checks its own version and also the versions of the following software: operating system, Percona Monitoring and Management (PMM), MySQL, Perl, MySQL driver for Perl (DBD::mysql), and Percona Toolkit. Second, it checks for and warns about versions with known problems. For example, MySQL 5.5.25 had a critical bug and was re-released as 5.5.25a.
A secure connection to Percona’s Version Check database server is done to perform these checks. Each request is logged by the server, including software version numbers and unique ID of the checked system. The ID is generated by the Percona Toolkit installation script or when the Version Check database call is done for the first time.
Any updates or known problems are printed to STDOUT before the tool’s normal output. This feature should never interfere with the normal operation of the tool.
For more information, visit https://www.percona.com/doc/percona-toolkit/LATEST/version-check.html.
DSN OPTIONS¶
These DSN options are used to create a DSN. Each option is given like option=value. The options are case-sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details.
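To make the DSN syntax concrete, here is a minimal Python sketch of splitting such a string. It is not Percona Toolkit's parser: the function name `parse_dsn` is hypothetical, and it handles only comma-separated option=value pairs with backslash-escaped commas in values.

```python
import re

def parse_dsn(dsn: str) -> dict:
    """Split 'key=value,key=value' into a dict, honoring escaped commas."""
    options = {}
    # Split on commas that are not preceded by a backslash.
    for part in re.split(r"(?<!\\),", dsn):
        key, value = part.split("=", 1)
        # Keys stay case-sensitive: P (port) and p (password) are distinct.
        options[key] = value.replace("\\,", ",")
    return options

parsed = parse_dsn(r"h=replica-server,P=3306,p=exam\,ple")
print(parsed)
```

The example string sets the host, port, and a password containing a comma, which is why the comma inside the password is written with a leading backslash.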
A
dsn: charset; copy: yes
Default character set.
D
dsn: database; copy: yes
Default database.
F
dsn: mysql_read_default_file; copy: yes
Only read default options from the given file
h
dsn: host; copy: yes
Connect to host.
p
dsn: password; copy: yes
Password to use when connecting. If password contains commas they must be escaped with a backslash: “exam\,ple”
P
dsn: port; copy: yes
Port number to use for connection.
S
dsn: mysql_socket; copy: yes
Socket file to use for connection.
u
dsn: user; copy: yes
User for login if not current user.
s
dsn: mysql_ssl; copy: yes
Create SSL connection
ENVIRONMENT¶
The environment variable PTDEBUG enables verbose debugging output to STDERR.
To enable debugging and capture all output to a file, run the tool like:
PTDEBUG=1 pt-heartbeat ... > FILE 2>&1
Be careful: debugging output is voluminous and can generate several megabytes of output.
ATTENTION¶
Using PTDEBUG might expose passwords. When debug is enabled, all command line parameters are shown in the output.
SYSTEM REQUIREMENTS¶
You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl.
BUGS¶
For a list of known bugs, see https://jira.percona.com/projects/PT/issues.
Please report bugs at https://jira.percona.com/projects/PT. Include the following information in your bug report:
Complete command-line used to run the tool
Tool --version
MySQL version of all servers involved
Output from the tool including STDERR
Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.
DOWNLOADING¶
Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb
You can also get individual tools from the latest release:
wget percona.com/get/TOOL
Replace TOOL with the name of any tool.
ABOUT PERCONA TOOLKIT¶
This tool is part of Percona Toolkit, a collection of advanced command-line tools for MySQL developed by Percona. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and primarily developed by him and Daniel Nichter. Visit http://www.percona.com/software/ to learn about other free, open-source software from Percona.
COPYRIGHT, LICENSE, AND WARRANTY¶
This program is copyright 2007-2024 Percona LLC and/or its affiliates, 2006 Proven Scaling LLC and Six Apart Ltd.
Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue `man perlgpl’ or `man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
VERSION¶
pt-heartbeat 3.7.0