Thursday, May 7, 2009

What files install added or changed

It is useful to know which files were added, changed, or deleted by an installation.

A method that kind of works is to make a timestamp file before the installation and then do a find with -cnewer (compared to the timestamp file) after the installation. Using -cnewer rather than -newer also catches files that were installed with a preserve-timestamp option such as -p, because writing the file still updates its ctime even when the mtime is preserved. A problem with the timestamp method is that you cannot tell whether a file was added or changed, and it cannot show deletions at all. For example, it might be unsafe to delete a file that was changed, because something else might need it. If it was a re-installation, well, you should probably have uninstalled the old version first. Another example: if a file was changed, you might wonder why. What if the installation overwrote a file that had been carefully customized?
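Here is a minimal sketch of the timestamp method, run in a scratch directory so it is safe to try. The directory layout and file names (stamp, added.bin, existing.conf) are made up for the example, and the two echo commands merely stand in for a real installation.

```shell
#!/bin/sh
# Timestamp method in a throwaway directory (all names are hypothetical).
work=$(mktemp -d)
mkdir "$work/root"
echo old > "$work/root/existing.conf"

# 1. Make the timestamp file before the "installation".
touch "$work/stamp"
sleep 1   # make sure later ctimes are strictly newer than the stamp

# 2. Stand-in for the installation: add one file, change another.
echo new > "$work/root/added.bin"
echo changed >> "$work/root/existing.conf"

# 3. List regular files whose status changed after the stamp was made.
touched=$(find "$work/root" -type f -cnewer "$work/stamp" | LC_ALL=C sort)
printf '%s\n' "$touched"

rm -rf "$work"
```

Both files show up in the listing, with nothing to say which one is new and which one was modified. That is exactly the weakness described above.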

Another method that seems to work is to do a find before the installation and a find after the installation and log the finds to files such as find.before and find.after. If the finds also contain information such as the inode, bytes, and ctime, it is possible to detect what was added, changed, or deleted. A diff with the -u option can be analyzed to get this information. Diff -u shows a minus - for a deletion, a plus + for an addition, and two lines for a change: a - line with the old entry and a + line with the new entry. It is possible to collapse the two-line change into a single line marked with an exclamation point ! instead of showing one minus - line for deletion and a plus + line for re-addition.

In order for the diff to accurately show what was changed by the installation, the find.before should be done immediately before the installation and the find.after should be done immediately after the installation, and there should be no disk activity in the dirs being checked other than the installation.
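The before/after method can be tried out in a scratch directory first. The sketch below is only an illustration: the snapshot function and the file names are made up, the fields are cut down to path, size, and type so the output stays stable, and -printf assumes GNU find.

```shell
#!/bin/sh
# Before/after snapshots of a throwaway tree (all names are hypothetical).
work=$(mktemp -d)
mkdir "$work/root"
echo a > "$work/root/kept"
echo b > "$work/root/changed"
echo c > "$work/root/removed"

snapshot() {
    # One line per file: path, size in bytes, type.
    # Sorted so that diff compares the same files against each other.
    find "$1" -type f -printf '%p %s %y\n' | LC_ALL=C sort
}

snapshot "$work/root" > "$work/find.before"

# Stand-in for the installation: add, change, and delete one file each.
echo new > "$work/root/added"
echo longer >> "$work/root/changed"
rm "$work/root/removed"

snapshot "$work/root" > "$work/find.after"

# Keep only the entry lines; the ---/+++ headers and @@ lines are dropped.
result=$(diff -u "$work/find.before" "$work/find.after" | grep '^[+-]/')
printf '%s\n' "$result"

rm -rf "$work"
```

The added file gets one + line, the removed file one - line, and the changed file one of each. Note that diff does not promise to put the - and + lines for the same file next to each other.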

I like to do a find sort of like this:
find / \
    -regex "${rexpN}" -prune -o \
    ! -path /usr/share/info/dir \
    -printf "%p %s %y %i %n %c\n"
The -printf fields are the path, size in bytes, file type, inode number, hard link count, and ctime. ${rexpN} is a conglomeration of dirs to exclude and winds up looking something crazy like this:
'\(^/proc\)\|\(^/dev\)\|\(^/sys\)\|\(^/tmp\)\|\(^/home\)\|\(^/root\)'
Depending on how the find gets executed, you may have to add 2 backslashes before each backslash. In some cases, if the shell gets hold of the backslashes, it may seem like it eats as many as possible.
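One way to avoid typing that regex by hand is to build it from a plain list of dirs. This is just a sketch: the sep variable is a name made up here, and the dir list simply repeats the example above. Keeping the backslashes inside single quotes means the shell never gets a chance to eat them.

```shell
#!/bin/sh
# Build the exclusion regex from a list of directories.
sep='\|'      # alternation operator in find's default regex syntax
rexpN=''
for d in /proc /dev /sys /tmp /home /root; do
    # Append the separator only when rexpN already holds an alternative.
    rexpN="${rexpN}${rexpN:+$sep}"'\(^'"$d"'\)'
done
printf '%s\n' "$rexpN"
# prints \(^/proc\)\|\(^/dev\)\|\(^/sys\)\|\(^/tmp\)\|\(^/home\)\|\(^/root\)
```

The result can then be passed to find as -regex "$rexpN", with the double quotes keeping it a single argument.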

Here are 3 scripts to Print Unified Diff (pud) in a more human-readable format. The idea is basically like deleting one of 2 consecutive duplicate lines, except that only the filename is tested for duplication: we discard the minus - line and change the plus + line to an exclamation point ! line. A space is added to make the filename a separate field instead of having a plus, minus, or exclamation symbol stuck right to it.
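To make the target format concrete, here is a throwaway sketch that feeds a small hand-written diff through an inline awk program. It is not one of the 3 scripts, just the same idea boiled down. The paths and sizes in the sample are made up, and the diff headers are trimmed.

```shell
#!/bin/sh
# Hand-written sample of a unified diff between two find logs.
sample='--- find.before
+++ find.after
@@ -1,3 +1,3 @@
-/opt/app/changed 2 f
+/opt/app/added 4 f
+/opt/app/changed 9 f
 /opt/app/kept 2 f
-/opt/app/removed 2 f'

out=$(printf '%s\n' "$sample" | awk '
    /^([+][+][+]|---)/ { next }          # skip the two header lines
    /^[+-]/ {
        name = substr($1, 2)             # filename without the +/- symbol
        # A filename seen for the second time means the file was changed.
        mark = (name in seen) ? "!" : substr($0, 1, 1)
        seen[name] = mark " " substr($0, 2)
    }
    END { for (name in seen) print seen[name] }
' | LC_ALL=C sort -k 2)

printf '%s\n' "$out"
# prints:
# + /opt/app/added 4 f
# ! /opt/app/changed 9 f
# - /opt/app/removed 2 f
```

The sort at the end is needed because awk walks the array in no particular order.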

The perl and awk versions have a self-contained sort.
The perl version is a modified version of a2p pud.awk.
The perl version can optionally accept 2 filenames and execute the diff itself.
The sed version must be piped to sort if sorted output is wanted.

AWK VERSION: pud.awk
Usage: diff -u find.before find.after | awk -f pud.awk
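#!/usr/bin/awk -f
# Note: asort() is a gawk extension, so awk must be gawk here.
{
    symb = substr($1, 1, 1)    # first character: +, -, or something else
    line = substr($0, 2)       # the line without its leading symbol
    headr = symb symb
    if ( symb == "+" || symb == "-" ) {
        # skip the ---/+++ header lines of the diff
        if ( substr($0, 1, 2) != headr) {
            filename = substr($1, 2)
            if ( filename in f ) {
                # seen before, so this is a change
                f[filename] = "!" " " line
            }
            else {
                f[filename] = symb " " line
            }
        }
    }
}
END {
    j = 1
    for ( i in f ) {
        s[j] = i # copy f index to unsorted array
        j++
    }
    n = asort(s) # sort it
    for (i = 1; i <= n; i++) {
        print f[s[i]]
    }
}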

PERL VERSION: pud.pl
Usage1: diff -u find.before find.after | perl pud.pl
Usage2: pud.pl find.before find.after
#!/usr/bin/perl

@cmdout = ();
if (scalar @ARGV == 0) {
    # no @ARGV, so assume they piped the diff command to us
    @cmdout = (<>);
}
else {
    # got @ARGV, so assume they want us to do the diff command
    $before = $ARGV[0];
    $after = $ARGV[1];
    if (! stat($before) or ! stat($after)) {
        die "Can't stat $before or $after";
    }
    # list-form open runs diff directly, without passing the names through a shell
    open(CMD, '-|', 'diff', '-u', $before, $after)
        or die "Can't run diff: $!";
    @cmdout = <CMD>;
    close(CMD);
}

#$[ = 1; # set array base to 1 (but don't do that)
$, = ' '; # set output field separator for print
$\ = "\n"; # set output record separator for print

#while (<>) {
foreach (@cmdout) {
    chomp; # strip record separator
    @Fld = split(' ', $_, -1);

    $symb = substr($Fld[0], 0, 1);
    $line = substr($_, 1);
    $headr = $symb . $symb;
    if ($symb eq '+' || $symb eq '-') {
        # skip the ---/+++ header lines of the diff
        if (substr($_, 0, 2) ne $headr) {
            $filename = substr($Fld[0], 1);
            if (defined $f{$filename}) {
                # seen before, so this is a change
                $f{$filename} = '!' . ' ' . $line;
            }
            else {
                $f{$filename} = $symb . ' ' . $line;
            }
        }
    }
}

foreach $i (sort(keys %f)) {
    print $f{$i};
}

SED VERSION: pud.sed
Usage: diff -u find.before find.after | sed -f pud.sed | LC_ALL=C sort -k 2
#!/bin/sed -f

# It was not that easy to get the sed-one-liner for deleting
# consecutive duplicate lines to do what I wanted.
# sed '$!N; /^\(.*\)\n\1$/!P; D'

# reject lines not beginning with + or -
/^[^+-]/d
# reject lines beginning with +++
/^[+][+][+]/d
# reject lines beginning with ---
/^[-][-][-]/d

$! {
# Add a newline to the pattern space,
# then append the next line of input to the pattern space.
N
}

# If the pattern space does not have a line beginning with -
# followed by a line beginning with + for the same filename.
# (The space after the filename stops a match on a mere common prefix.)
# Note: a change is only detected when the - line is immediately
# followed by its + line; diff does not always keep them adjacent.
/^[-]\(\S*\) .*\n[+]\1 .*$/! {
# put a space after the + or -
s/^\([+-]\)/\1 /
# Print the portion of the pattern space up to the first newline.
P
# Delete text in the pattern space up to the first newline. If any
# text is left, restart cycle with the resultant pattern space
# (without reading a new line of input), otherwise start a normal
# new cycle.
D
}

# If the pattern space does have a line beginning with -
# followed by a line beginning with + for the same filename,
# fix up the - line and the + line to look like a ! change line.
/^[-]\(\S*\) .*\n[+]\1 .*$/ {
# Throw away the part beginning with - up through the newline.
s/^.*\n//
# Change the + to ! followed by a space in the remaining part.
s/^[+]/! /
}
