EXIMSTATS Section: EXIM (8) Updated: 2004-10-07
eximstats [Options] mainlog1 mainlog2 ... > report.txt
eximstats -merge [Options] report.1.txt report.2.txt ... > weekly_report.txt
Options:
0, 1, 2, 3, 5, 10, 15, 20, 30 or 60.
-pattern 'Refused connections' '/refused connection/'
eximstats mainlog.sun > report.sun.txt
eximstats mainlog.mon > report.mon.txt
eximstats mainlog.tue > report.tue.txt
eximstats mainlog.wed > report.wed.txt
eximstats mainlog.thu > report.thu.txt
eximstats mainlog.fri > report.fri.txt
eximstats mainlog.sat > report.sat.txt
eximstats -merge report.*.txt > weekly_report.txt
eximstats -merge -html report.*.txt > weekly_report.html
This requires the following modules which can be obtained from http://www.cpan.org/modules/01modules.index.html
To install these, download and unpack them, then use the normal perl installation procedure:
perl Makefile.PL
make
make test
make install
$rounded_volume = volume_rounded($bytes,$gigabytes);
Given a data size in bytes, round it to KB, MB, or GB as appropriate.
Eg 12000 => 12KB, 15000000 => 14MB, etc.
Note: I've experimented with Math::BigInt and it results in a 33% performance degradation as opposed to storing numbers split into bytes and gigabytes.
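The rounding described above might look roughly like the following sketch. The name volume_rounded_sketch, the 1024-based thresholds and the rounding to whole units are assumptions made for illustration; the real volume_rounded() may choose its units differently.

```perl
use strict;
use warnings;

# Hypothetical helper: render a volume held as ($bytes, $gigabytes)
# as a rounded string with a KB, MB or GB suffix.
sub volume_rounded_sketch {
    my ($bytes, $gigabytes) = @_;
    $gigabytes ||= 0;

    # Fold whole gigabytes back in as a floating-point total (display only).
    my $total = $bytes + $gigabytes * 1024**3;

    return sprintf('%.0fGB', $total / 1024**3) if $total >= 1024**3;
    return sprintf('%.0fMB', $total / 1024**2) if $total >= 1024**2;
    return sprintf('%.0fKB', $total / 1024)    if $total >= 1024;
    return sprintf('%d', $total);
}

print volume_rounded_sketch(12000, 0), "\n";      # 12KB
print volume_rounded_sketch(15000000, 0), "\n";   # 14MB
```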
un_round($rounded_volume,\$bytes,\$gigabytes);
Given a volume in KB, MB or GB, as generated by volume_rounded(), do the reverse transformation and convert it back into Bytes and Gigabytes. These are added to the $bytes and $gigabytes parameters.
EG: 500 => (500,0), 14GB => (0,14), etc.
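A minimal sketch of the reverse transformation, assuming the rounded form is a whole number followed by an optional KB, MB or GB suffix. The name un_round_sketch and the accepted input syntax are illustrative assumptions, not the actual eximstats code.

```perl
use strict;
use warnings;

# Hypothetical inverse: parse "12KB", "14MB", "3GB" or a bare byte count
# and add the result to the caller's $bytes / $gigabytes accumulators.
sub un_round_sketch {
    my ($rounded, $bytes_ref, $gigs_ref) = @_;
    if ($rounded =~ /^(\d+)GB$/) {
        $$gigs_ref += $1;
    }
    elsif ($rounded =~ /^(\d+)MB$/) {
        $$bytes_ref += $1 * 1024 * 1024;
    }
    elsif ($rounded =~ /^(\d+)KB$/) {
        $$bytes_ref += $1 * 1024;
    }
    elsif ($rounded =~ /^(\d+)$/) {
        $$bytes_ref += $1;
    }
    return;
}

my ($bytes, $gigs) = (0, 0);
un_round_sketch('500',  \$bytes, \$gigs);
un_round_sketch('14GB', \$bytes, \$gigs);
print "$bytes bytes, $gigs gigabytes\n";   # 500 bytes, 14 gigabytes
```

Because the values are added to the accumulators rather than assigned, repeated calls can total a column of rounded volumes.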
add_volume(\$bytes,\$gigs,$size);
Add $size to $bytes/$gigs where this is a number split into bytes ($bytes) and gigabytes ($gigs). This is significantly faster than using Math::BigInt.
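The split-counter idea can be sketched as below. The name add_volume_sketch and the carry loop are assumptions for illustration; the point is only that the byte counter stays below one gigabyte while the overflow is carried into the gigabyte counter.

```perl
use strict;
use warnings;

use constant GIG => 1024 * 1024 * 1024;

# Hypothetical helper: add $size bytes to a counter split into
# bytes and gigabytes, carrying overflow from bytes into gigabytes.
sub add_volume_sketch {
    my ($bytes_ref, $gigs_ref, $size) = @_;
    $$bytes_ref += $size;
    while ($$bytes_ref >= GIG) {
        $$bytes_ref -= GIG;
        $$gigs_ref++;
    }
    return;
}

my ($bytes, $gigs) = (GIG - 10, 0);
add_volume_sketch(\$bytes, \$gigs, 25);
print "$gigs gigabytes + $bytes bytes\n";   # 1 gigabytes + 15 bytes
```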
$formatted_time = format_time($seconds);
Given a time in seconds, break it down into weeks, days, hours, minutes, and seconds.
$seconds = unformat_time($formatted_time);
Given a time in weeks, days, hours, minutes, or seconds, convert it to seconds.
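The two conversions can be sketched as a matching pair. The function names and the compact "1w2d3h4m5s" notation used here are assumptions for illustration; the actual output format of format_time() may differ.

```perl
use strict;
use warnings;

# Hypothetical formatter: 788645 seconds -> "1w2d3h4m5s".
# The compact w/d/h/m/s suffix style is an assumption.
sub format_time_sketch {
    my ($t) = @_;
    my $s = $t % 60;  $t = int($t / 60);
    my $m = $t % 60;  $t = int($t / 60);
    my $h = $t % 24;  $t = int($t / 24);
    my $d = $t % 7;
    my $w = int($t / 7);

    my $out = '';
    $out .= "${w}w" if $w;
    $out .= "${d}d" if $d;
    $out .= "${h}h" if $h;
    $out .= "${m}m" if $m;
    $out .= "${s}s" if $s || $out eq '';
    return $out;
}

# Hypothetical inverse: sum each "<number><unit>" component.
sub unformat_time_sketch {
    my ($formatted) = @_;
    my %unit = (w => 604800, d => 86400, h => 3600, m => 60, s => 1);
    my $seconds = 0;
    $seconds += $1 * $unit{$2} while $formatted =~ /(\d+)([wdhms])/g;
    return $seconds;
}

print format_time_sketch(788645), "\n";           # 1w2d3h4m5s
print unformat_time_sketch('1w2d3h4m5s'), "\n";   # 788645
```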
$time = seconds($timestamp);
Given a time-of-day timestamp, convert it into a time() value using POSIX::mktime. We expect the timestamp to be of the form "$year-$mon-$day $hour:$min:$sec", with month going from 1 to 12, and the year to be absolute (we do the necessary conversions). The timestamp may be followed with an offset from UTC like "+$hh$mm"; if the offset is not present, and we have not been told that the log is in UTC (with the -utc option), then we adjust the time by the current local time offset so that it can be compared with the time recorded in message IDs, which is UTC.
To improve performance, we only use mktime on the date ($year-$mon-$day), and only calculate it if the date is different to the previous time we came here. We then add on seconds for the '$hour:$min:$sec'.
We also store the results of the last conversion done, and only recalculate if the date is different.
We used to have the '-cache' flag which would store the results of the mktime() call. However, the current way of just using mktime() on the date obsoletes this.
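The date-caching approach can be sketched as follows. This is a simplified illustration: the name seconds_sketch is hypothetical, the UTC offset and -utc handling are left out, and the $mktime_calls counter exists only to show how often mktime() is actually called.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

my ($last_date, $last_midnight) = ('', 0);
my $mktime_calls = 0;   # instrumentation for this sketch only

# Hypothetical simplified version: ignores UTC offsets and the -utc option.
sub seconds_sketch {
    my ($timestamp) = @_;
    my ($date, $hour, $min, $sec) =
        $timestamp =~ /^(\d{4}-\d\d-\d\d) (\d\d):(\d\d):(\d\d)/
        or return;

    # Only call mktime() when the date part changes.
    if ($date ne $last_date) {
        my ($year, $mon, $day) = split /-/, $date;
        $last_midnight = mktime(0, 0, 0, $day, $mon - 1, $year - 1900);
        $last_date     = $date;
        $mktime_calls++;
    }
    return $last_midnight + $hour * 3600 + $min * 60 + $sec;
}

my $first  = seconds_sketch('2004-10-07 12:00:00');
my $second = seconds_sketch('2004-10-07 12:00:30');
print $second - $first, "\n";   # 30
print "$mktime_calls\n";        # 1
```

Since log lines arrive in chronological order, nearly every call hits the cached date and costs only a few multiplications.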
$time = id_seconds($message_id);
Given a message ID, convert it into a time() value.
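As background for the sketch below, it is assumed that the first six characters of a traditional Exim message ID encode the reception time in seconds as a base-62 number, using the digit alphabet 0-9, A-Z, a-z. That encoding is not stated in this page, and the name id_seconds_sketch is hypothetical.

```perl
use strict;
use warnings;

# Base-62 digit values, assuming the alphabet 0-9, A-Z, a-z.
my %base62;
@base62{ ('0' .. '9', 'A' .. 'Z', 'a' .. 'z') } = (0 .. 61);

# Hypothetical decoder: treat the first six characters of the message ID
# as a base-62 count of seconds since the epoch.
sub id_seconds_sketch {
    my ($message_id) = @_;
    my $seconds = 0;
    $seconds = $seconds * 62 + $base62{$_}
        for split //, substr($message_id, 0, 6);
    return $seconds;
}

print id_seconds_sketch('000010-000000-00'), "\n";   # 62
```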
$localtime_offset = calculate_localtime_offset();
Calculate the localtime offset from gmtime in seconds.
$localtime = time() + $localtime_offset.
These are the same semantics as ISO 8601 and RFC 2822 timezone offsets. (West is negative, East is positive.)
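One way to obtain such an offset is sketched below, using timegm() from the core Time::Local module. The name calculate_localtime_offset_sketch is hypothetical and the real routine may compute the value differently.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical version: reinterpret the local broken-down time as if it
# were UTC; the difference from the real epoch value is the offset.
sub calculate_localtime_offset_sketch {
    my $now = time();
    return timegm((localtime($now))[0 .. 5]) - $now;
}

my $offset = calculate_localtime_offset_sketch();
printf "local time is UTC%+d seconds\n", $offset;
```

East of Greenwich the local clock reads ahead of UTC, so the result is positive, which matches the sign convention described above.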
$time = print_queue_times($message_type,\@queue_times,$queue_more_than);
Given the type of messages being output, the array of message queue times, and the number of messages which exceeded the queue times, print out a table.
print_histogram('Deliveries|Messages received',@interval_count);
Print a histogram of the messages delivered/received per time slot (hour by default).
print_league_table($league_table_type,\%message_count,\%message_data,\%message_data_gigs);
Given hashes of message count and message data, which are keyed by the table type (eg by the sending host), print a league table showing the top $topcount (defaults to 50).
@sorted_keys = top_n_sort($n,$href1,$href2,$href3);
Given a hash which has numerical values, return the sorted $n keys which point to the top values. The second and third hashes are used as tiebreakers. They all must have the same keys.
The idea behind this routine is that when you only want to see the top n members of a set, rather than sorting the entire set and then plucking off the top n, sort through the stack as you go, discarding any member which is lower than your current n'th highest member.
This proves to be an order of magnitude faster for large hashes. On 200,000 lines of mainlog it benchmarked 9 times faster. On 700,000 lines of mainlog it benchmarked 13.8 times faster.
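The selection strategy can be sketched as follows. This is an illustration under assumptions, not the eximstats implementation: the name top_n_sort_sketch is hypothetical, and the running list is kept in order by simple linear insertion.

```perl
use strict;
use warnings;

# Hypothetical sketch: keep a sorted list of at most $n keys while
# scanning the hash once, instead of sorting every key.
sub top_n_sort_sketch {
    my ($n, $h1, $h2, $h3) = @_;
    return () if $n < 1;

    # Descending by first hash; second and third hashes break ties.
    my $cmp = sub {
        my ($x, $y) = @_;
        return $h1->{$y} <=> $h1->{$x}
            || ($h2 ? $h2->{$y} <=> $h2->{$x} : 0)
            || ($h3 ? $h3->{$y} <=> $h3->{$x} : 0);
    };

    my @top;
    for my $key (keys %$h1) {
        # Skip anything that cannot beat the current n'th entry.
        next if @top == $n && $cmp->($key, $top[-1]) >= 0;

        # Insert in order, then drop the displaced lowest entry.
        my $i = 0;
        $i++ while $i < @top && $cmp->($top[$i], $key) <= 0;
        splice @top, $i, 0, $key;
        pop @top if @top > $n;
    }
    return @top;
}

my %count = (alpha => 5, beta => 9, gamma => 1, delta => 7);
print join(',', top_n_sort_sketch(2, \%count)), "\n";   # beta,delta
```

Most keys in a large hash fail the first comparison against the current n'th entry and are discarded immediately, which is where the saving over a full sort comes from.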
$header = html_header($title);
Print our HTML header and start the <body> block.
help();
Display usage instructions and exit.
$parser = generate_parser();
This subroutine generates the parsing routine which will be used to parse the mainlog. We take the base operation, and remove bits not in use. This improves performance depending on what bits you take out or add.
I've tested using study(), but this does not improve performance.
We store our parsing routine in a variable and process it, looking for #IFDEF (Expression) or #IFNDEF (Expression) statements and the corresponding #ENDIF (Expression) statements. The enclosed code is kept if an #IFDEF expression evaluates to true or an #IFNDEF expression evaluates to false; otherwise it is removed.
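The conditional-inclusion step can be sketched as a small text preprocessor. Everything here is a simplification for illustration: the name preprocess_sketch is hypothetical, expressions are reduced to flag names looked up in a hash, and the routine names in the template are invented. The surviving text could then be compiled into a subroutine, for example with a string eval.

```perl
use strict;
use warnings;

# Hypothetical preprocessor: keep or drop the lines between
# "#IFDEF (name)" / "#IFNDEF (name)" and the matching "#ENDIF (name)".
# Expressions are simplified to flag names looked up in %$flags.
sub preprocess_sketch {
    my ($template, $flags) = @_;
    my @out;
    my $skip_until;   # name whose #ENDIF ends the current skipped region

    for my $line (split /\n/, $template) {
        if (defined $skip_until) {
            undef $skip_until
                if $line =~ /^\s*#ENDIF\s*\(\Q$skip_until\E\)/;
            next;
        }
        if ($line =~ /^\s*#(IFDEF|IFNDEF)\s*\((\w+)\)/) {
            my $wanted = $1 eq 'IFDEF' ? $flags->{$2} : !$flags->{$2};
            $skip_until = $2 unless $wanted;
            next;
        }
        next if $line =~ /^\s*#ENDIF\b/;
        push @out, $line;
    }
    return join("\n", @out) . "\n";
}

# Invented template; the routine names are placeholders.
my $template = <<'END';
count_message();
#IFDEF (show_relay)
record_relay();
#ENDIF (show_relay)
#IFNDEF (quiet)
log_progress();
#ENDIF (quiet)
END

print preprocess_sketch($template, { show_relay => 0, quiet => 0 });
# prints count_message(); and log_progress(); on separate lines
```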
parse($parser,\*FILEHANDLE);
This subroutine accepts a parser and a filehandle from main and parses each line. We store the results into global variables.
print_header();
Print our headers and contents.
print_grandtotals();
print_user_patterns();
Print the counts of user specified patterns.
print_transport();
print_relay();
print_errors();
Print our errors. In HTML, we display them as a list rather than a table - Netscape doesn't like large tables!
parse_old_eximstat_reports($fh);
Parse old eximstats output so we can merge daily stats to weekly stats and weekly to monthly etc.
To test that the merging still works after changes, do something like the following. All the diffs should produce no output.
options='-bydomain -byemail -byhost -byedomain'
options="$options -pattern 'Completed Messages' /Completed/"
options="$options -pattern 'Received Messages' /<=/"
./eximstats $options mainlog > mainlog.txt
./eximstats $options -merge mainlog.txt > mainlog.2.txt
diff mainlog.txt mainlog.2.txt
./eximstats $options -html mainlog > mainlog.html
./eximstats $options -merge -html mainlog.txt > mainlog.2.html
diff mainlog.html mainlog.2.html
./eximstats $options -merge mainlog.html > mainlog.3.txt
diff mainlog.txt mainlog.3.txt
./eximstats $options -merge -html mainlog.html > mainlog.3.html
diff mainlog.html mainlog.3.html
./eximstats $options -nvr mainlog > mainlog.nvr.txt
./eximstats $options -merge mainlog.nvr.txt > mainlog.4.txt
diff mainlog.txt mainlog.4.txt
# double_mainlog.txt should have twice the values that mainlog.txt has.
./eximstats $options mainlog mainlog > double_mainlog.txt
update_relayed($count,$sender,$recipient);
Adds an entry into the %relayed hash. Currently only used when merging reports.
add_to_totals(\%totals,\@keys,$values);
Given a line of space separated values, add them into the provided hash using @keys as the hash keys.
If the value contains a '%', then the value is set rather than added. Otherwise, we convert the value to bytes and gigs. The gigs get added to Key-gigs.
$total = get_report_total(\%hash,$key);
If %hash contains values split into Units and Gigs, we calculate and return
$hash{$key} + 1024*1024*1024 * $hash{"${key}-gigs"}
$text_line = html2txt($html_line);
Convert a line from HTML to text. Currently we just convert HTML tags to spaces and convert the &gt;, &lt;, and &nbsp; entities back to the characters they represent.
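A minimal sketch of such a conversion follows. The name html2txt_sketch and the exact set of entities handled (&lt;, &gt; and &nbsp;) are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical converter: tags become spaces, then a few entities
# are turned back into the characters they stand for.
sub html2txt_sketch {
    my ($line) = @_;
    $line =~ s/<[^>]*>/ /g;   # replace each tag with a space
    $line =~ s/&lt;/</g;
    $line =~ s/&gt;/>/g;
    $line =~ s/&nbsp;/ /g;
    return $line;
}

print html2txt_sketch('<td>a &lt;= b</td>'), "\n";   # " a <= b "
```

Tags are removed before the entities are decoded, so a decoded "<" is never mistaken for the start of a tag.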
$arg = get_next_arg();
Because eximstats arguments are often passed as variables, we can't rely on shell parsing to deal with quotes. This subroutine returns $ARGV[1] and does a shift. If $ARGV[1] starts with a quote (' or "), and doesn't end in one, then we append the next argument to it and shift again. We repeat until we've got all of the argument.
This isn't perfect as all white space gets reduced to one space, but it's as good as we can get! If it's essential that spacing be preserved precisely, then you get that by not using shell variables.
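The quote-joining loop can be sketched as below. To keep the example self-contained it works on an array reference rather than on @ARGV, and the name get_next_arg_sketch is hypothetical. Whether the real routine strips the surrounding quotes is not stated here, so this sketch leaves them in place.

```perl
use strict;
use warnings;

# Hypothetical version operating on an array reference instead of @ARGV.
# $args->[0] is the option currently being processed; its value starts
# at $args->[1].
sub get_next_arg_sketch {
    my ($args) = @_;
    my $arg = '';
    while (defined $args->[1]) {
        $arg .= ' ' if length $arg;
        $arg .= $args->[1];
        shift @$args;

        # Keep going only while inside an unterminated quoted string.
        my ($quote) = $arg =~ /^(['"])/;
        last unless defined $quote && substr($arg, -1) ne $quote;
    }
    return $arg;
}

# What the script sees after the shell has split an unquoted $options.
my @args = ('-pattern', "'Refused", "connections'", '/refused connection/');
print get_next_arg_sketch(\@args), "\n";   # 'Refused connections'
print get_next_arg_sketch(\@args), "\n";   # /refused connection/
```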