hg-sig

A tool to create a human readable signature for mercurial repositories

Overview

This tool gives an overview of a mercurial repository or helps to compare two mercurial repositories.

It does this by creating a long text that contains all patches, their log messages and their mercurial node IDs or checksums. This text contains enough information to find all relevant differences between repositories.

Background

In mercurial, a working copy always contains a complete repository, and it is part of the normal workflow to have several repositories at the same time. Mercurial mq patches add even more flexibility: you can add, remove or reorder patches. Although each mercurial patch has a unique node id, applying or reordering mq patches changes these node ids. Node ids are therefore not sufficient if you want to compare repositories with mq patches.

For this reason, this tool computes an MD5 checksum of the output of the command "hg diff --nodate -g" for each patch. mq patches that have the same checksum are identical even if they have different node ids.
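
The checksum idea can be illustrated with a minimal Python sketch. It is not the tool's actual code. The function names are hypothetical, and the use of "-c REV" to select a single changeset is an assumption, since the text above only names the options "--nodate -g".

```python
import hashlib
import subprocess


def checksum_of_diff(diff_bytes):
    """Return the MD5 hex digest of raw diff output."""
    return hashlib.md5(diff_bytes).hexdigest()


def patch_checksum(repository, rev):
    """Checksum one changeset of a repository (hypothetical helper).

    Assumption: "-c REV" selects the changeset. The text above only
    names the options "--nodate -g".
    """
    diff_bytes = subprocess.check_output(
        ["hg", "diff", "--nodate", "-g", "-c", str(rev)],
        cwd=repository,
    )
    return checksum_of_diff(diff_bytes)
```

Since the checksum depends only on the diff text, two patches with the same content get the same checksum regardless of their node ids.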

Calculating checksums is expensive: it takes a long time if you have many patches. For this reason, the tool usually computes checksums only for mq patches and not for regular mercurial patches. If two repositories are compared, checksums are calculated from the first patch that differs up to the topmost patch.
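
Finding the first patch that differs can be sketched as follows. This is only an illustration under the assumption that each repository is represented by a list of node ids, ordered from the oldest patch to the topmost one. The function name is hypothetical.

```python
def first_difference(nodes_a, nodes_b):
    """Return the index of the first position where the two lists differ.

    Both lists hold node ids ordered from the oldest patch to the topmost.
    If one list is a prefix of the other, the result is the length of the
    shorter list. If both lists are equal, the result is None.
    """
    for index, (a, b) in enumerate(zip(nodes_a, nodes_b)):
        if a != b:
            return index
    if len(nodes_a) != len(nodes_b):
        return min(len(nodes_a), len(nodes_b))
    return None
```

Checksums would then only be needed for the patches from this index upwards.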

Another possibility to speed up the program is to use a cache file. This cache file contains checksums for a list of node ids. If a node is found in the file, the program doesn't calculate the checksum again.
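
The lookup can be sketched like this. The format of the cache file is not described here, so the sketch assumes one "node checksum" pair per line. The function names and this file format are hypothetical.

```python
def load_cache(lines):
    """Build a dict node id -> checksum from the lines of a cache file.

    Assumed format: one "node checksum" pair per line. Lines that do
    not consist of exactly two fields are ignored.
    """
    cache = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            node, checksum = parts
            cache[node] = checksum
    return cache


def cached_checksum(node, cache, compute):
    """Return the checksum for node; call compute(node) only on a miss."""
    if node not in cache:
        cache[node] = compute(node)
    return cache[node]
```

Here "compute" stands for whatever function calculates the checksum of a node. It is only called for nodes that are not yet in the cache.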

Output

The text the program generates from a repository is called a signature. It consists of sections, each separated by a line consisting of "=" characters.
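
If you want to process a signature with your own script, the sections can be separated with a few lines of Python. The following is a minimal sketch and not part of the tool. It assumes that the line after each "=" line starts with the section name followed by ":", as in the examples in this document. The function name is hypothetical.

```python
def split_sections(signature):
    """Split a signature text into a dict of section name -> list of lines."""
    sections = {}
    name = None
    expect_header = False
    for line in signature.splitlines():
        if line and set(line) == {"="}:
            # A line of "=" characters: the next line names the section.
            expect_header = True
            continue
        if expect_header:
            name, _, rest = line.partition(":")
            name = name.strip()
            rest = rest.strip()
            # Text after the colon (e.g. a checksum) is kept as first line.
            sections[name] = [rest] if rest else []
            expect_header = False
        elif name is not None:
            sections[name].append(line)
    return sections
```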

Status section

This section starts with the line "STATUS:". It contains the output of the command "hg status". You can see which files were added, modified or deleted. Files unknown to mercurial are marked with a "?" in the first column. Here is an example:

============================================================
STATUS:
------------------------------------------------------------
? NEW-sorted.db
? NEW.db
? OLD-sorted.db
? OLD.db
M idcpApp/tables/idcp13_gap2cc2.tab
M idcpApp/tables/idcp13_gap2cc3.tab
M idcpApp/tables/idcp13_gap2cc4.tab

For an explanation of the characters in the first column see "hg -v help status".

Unrecorded

This section starts with the line "UNRECORDED:". This word is followed by the checksum of all unrecorded changes, that is, changes that mercurial can see but that are not yet committed. If there are no unrecorded changes, this section is omitted. Here is an example:

============================================================
UNRECORDED: e394f876688059c4eb08a44ba1cbea5c

Identify

This section starts with the line "IDENTIFY:". It contains some of the output of the command "hg identify". It always shows the node id of the working copy and, if it was computed, the checksum of that node. Here is an example:

============================================================
IDENTIFY: 95d61d497932 checksum: b63bd6c550cf1d95d97ca0dd34a93b32

Patches

This section starts with the line "PATCHES:". It contains a text for each patch (version) of the repository. The patches are separated by lines containing only the "-" character. Each text starts with the log message of the patch, followed by several fields. The fields are:

FILES
The list of modified/added/removed files, possibly more than one line.
TAGS
If the patch has tags, this field lists them in a single line.
CHECKSUM
The checksum of the patch, if it was calculated.
NODE
The mercurial node id of the patch. If a checksum was calculated ("CHECKSUM"), the node id is not printed unless the option "--nodes" is given.

Here is an example:

============================================================
PATCHES:
------------------------------------------------------------
New correction coil tables for the U49-1 were supplied by W.F.

FILES: idcpApp/tables/idcp7_gap2cc2.tab idcpApp/tables/idcp7_gap2cc3.tab
      idcpApp/tables/idcp7_gap2cc4.tab idcpApp/tables/idcp7_gap2cc5.tab
CHECKSUM: 142da4f9665ea8cff0f649a0b2e650d7
------------------------------------------------------------
A minimum velocity was added to the application.

This minimum velocity can be specified for all drives for all insertion
devices. It is the smalles value a velocity can have.

These are the three optional parameters added to the StructuredData file:

v_min_velocity, h_min_velocity and c_min_velocity for the gap, the shift drive
and the chicane.

The global variables added to configure/configure.c are cnf_minvelocity,
cnf_minvelocity2 and cnf_minvelocity3.

FILES: idcpApp/configure/configure.c idcpApp/diag/diag.c
      idcpApp/tables/idcp_config.pyx

CHECKSUM: 2b5f405c2d38835654040497d47b2b32
------------------------------------------------------------
protocols/accp-gen.py was changed for the new version of the id_db2.py.

Function id_db2.all_ids() is deprecated and should be replaced by
id_db2.all_idcp_keys(). This change has now been applied to accp-gen.py.

FILES: config/config.yaml idcpApp/protocols/accp-gen.py
NODE    : 73956178341a
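
The layout of the patches section is regular enough to be read by a script. The following Python sketch is only an illustration and not part of the tool. It takes the lines of the section and returns one dict per patch. The function name is hypothetical, and the sketch assumes that no line of a log message consists only of "-" characters or starts with one of the field names, and that file names contain no spaces.

```python
import re

# A field line, e.g. "CHECKSUM: 142d..." or "NODE    : 73956178341a".
FIELD = re.compile(r"^(FILES|TAGS|CHECKSUM|NODE)\s*:\s*(.*)$")


def parse_patches(lines):
    """Parse the lines of a PATCHES section into a list of dicts."""
    patches = []
    current = None
    field = None
    for line in lines:
        if line and set(line) == {"-"}:
            # A separator line starts a new patch entry.
            current = {"message": []}
            patches.append(current)
            field = None
            continue
        if current is None:
            continue  # skip anything before the first separator
        match = FIELD.match(line)
        if match:
            field = match.group(1)
            current[field] = match.group(2).strip()
        elif field == "FILES" and line[:1].isspace():
            # Indented lines continue a multi-line FILES field.
            current["FILES"] += " " + line.strip()
        elif field is None:
            # Everything before the first field is the log message.
            current["message"].append(line)
    for patch in patches:
        patch["message"] = "\n".join(patch["message"]).strip()
        if "FILES" in patch:
            patch["FILES"] = patch["FILES"].split()
    return patches
```

A patch dict contains the key "CHECKSUM" or "NODE" only if the corresponding field was printed.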

Modes of operation

This tool can be applied to a single mercurial repository in order to create a signature text, or to two mercurial repositories in order to compare the two generated signatures. In the second case, the signatures can either be stored in two files or be written to temporary files that are compared with a GUI diff viewer like tkdiff, kompare or meld.

Quick reference

Reference of command line options

The program takes commands and options. Commands are simple words, whereas options always start with "-" or "--". Commands may be abbreviated, e.g. "sig" can be used instead of "signature" or "gcomp" instead of "gcompare". The following text always uses the long, unabbreviated command names:
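
How abbreviations are matched is not specified here. A common approach, shown in the following Python sketch, is to accept any prefix that identifies exactly one command. This is an assumption and not necessarily what the tool does. The function name is hypothetical.

```python
def resolve_command(word, commands):
    """Expand an abbreviated command to its full name.

    An exact match always wins. Otherwise the word must be a prefix of
    exactly one command (assumed rule, not confirmed for hg-sig).
    """
    if word in commands:
        return word
    matches = [c for c in commands if c.startswith(word)]
    if len(matches) != 1:
        raise ValueError("unknown or ambiguous command: %r" % word)
    return matches[0]
```

With the commands "print", "compare" and "gcompare", the word "gcomp" would be expanded to "gcompare" and "comp" to "compare".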

Commands

print

You use this in the form:

hg-sig print {repository}

This creates a signature for the given repository. If {repository} is omitted the program uses the repository in the current working directory.

compare

You use this in the form:

hg-sig compare [repository1] [repository2] [file1] [file2]

This creates two files, [file1] and [file2], one for each of the two repositories. It generates checksums from the first patch that is different between the repositories up to the top patch.

gcompare

You use this in the form:

hg-sig gcompare [repository1] [repository2] {compareprogram}

This command is similar to "compare", except that the signature files are created as temporary files. The program {compareprogram} is then called to display the differences between these files. If {compareprogram} is omitted, the program calls "meld".
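
The principle of this command can be sketched in Python as follows. This is an illustration under stated assumptions and not the tool's actual code. The function names and the names of the temporary files are hypothetical, and the sketch assumes that the compare program takes the two file names as arguments and returns when the user closes it.

```python
import os
import subprocess
import tempfile


def write_temp_signature(text, label):
    """Write one signature to a temporary file and return its path."""
    handle, path = tempfile.mkstemp(prefix="hg-sig-%s-" % label,
                                    suffix=".txt")
    with os.fdopen(handle, "w") as stream:
        stream.write(text)
    return path


def show_differences(signature1, signature2, compare_program="meld"):
    """Show two signatures in a diff viewer, then remove the files."""
    paths = [write_temp_signature(signature1, "1"),
             write_temp_signature(signature2, "2")]
    try:
        # Blocks until the viewer exits.
        subprocess.call([compare_program] + paths)
    finally:
        for path in paths:
            os.remove(path)
```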

Options

Here is a list of all command line options:

--version
 Show the program's version number and exit.
-h, --help
 Show this help message and exit.
--summary
 Print a summary of the function of the program.
--doc
 Create online help in reStructuredText format. Use "./hg-sig --doc | rst2html" to create html-help.
--cache=CACHEFILE
 Specify a cache file for patch checksums. This file is not modified by the program.
--wcache=CACHEFILE
 Specify a cache file for patch checksums. The program rewrites this file with the checksums that were present in the (first) repository.
--skip-common
 For "compare", begin printing patches at the first difference instead of the first patch.
--no-tags
 Do not show tags.
--nodes
 Print node ids for all patches, even patches for which a checksum was created.
--checksum-start=REV
 Start the checksum calculation at revision REV.
-x EXTRA, --extra=EXTRA
 Pass the given string as an extra option to mercurial when "hg log" is called.
-p, --progress
 Show progress on stderr.
-v, --verbose
 Show command calls.
-n, --dry-run
 Just show what the program would do.