Introduction
This is a Perl script that analyzes the BitTorrent
tracker logfile. In the usual case, it is run continuously, eating the
tail of the logfile, generating statistics on the files tracked, peers,
bandwidth usage, etc.
As a side-effect, it can also create graphs as shown on the right, using
rrdtool as the
graphing engine.
It is released under the GNU General Public License (GPL) and is therefore Free Software.
Downloads
| Type | Version | Link | Size | Notes |
| Script | 1.18 09/06/03 | trackerlyze.pl [highlighted] | 100kB | Current release |
| Script | 1.11 02/23/03 | trackerlyze.pl [highlighted] | 48kB | Initial release |
Source
The highlighted source code is available as well, of course (right next to the download).
Documentation
Requirements
- Perl (written with 5.6/5.8, anything after 5.005 should do)
- rrdtool for graphing
- A BitTorrent tracker logfile (or a redirect of the tracker's output). Both bttrack.py and bnbt outputs have been tested.
- optionally, rrdtool's RRDs Perl module
- optionally, MySQL for statistics storage
- older versions needed the Date::Parse module (CPAN)
Usage
This script won't work out of the box; sensible defaults have been
given, but you still need to customize some things. It's a CLI
program (though there is not much of an interface to speak of); it should
run on any operating system that meets the requirements, but it has only
been tested on Linux, FreeBSD, and Win32.
The configuration is changed within the script (everything at the top).
Most things are commented though, so it should be possible to work it out.
First, you need to decide how you want the data presented to you. You can let the program write all statistics into files of the form HASH.statsname, where HASH is the info_hash of the torrent and statsname is the name of the statistic (that's trackerlyze::IO::file), or write the data into SQL tables (that's trackerlyze::IO::sql). If you want to do it another way, create another IO module or combine the existing ones with trackerlyze::IO::multiplex. You can also disable either of the two functions by using the ::Dummy module.
All modules can be configured by passing parameters at initialization; the given examples make use of this.
Files
To use this, uncomment the corresponding block in the configuration (if
it is commented out) and change the settings the way you like them. Note
that calculating and writing out peerstats (statistics on which peers are
on a specific torrent, and data about each client's activity) can be quite
a burden on the machine if you run a big tracker.
statusdir specifies the directory statistics will be written to. You
could also set miscsuffix, which specifies a suffix to be used for
miscellaneous (non-torrent specific) statistics (such as total connected
peers or the size of all scrapes thus far).
SQL
In order for this to work, you need an SQL-capable database. The script
has been written and tested with MySQL (3.x, 4.x), so if you use another
database, you will have to change a few things.
As with files, uncomment the corresponding block in the configuration
and change the settings. The variables should be self-explanatory. Make sure the script can connect to your database, and that the tables you want to use either do not exist yet or are already set up.
Other/Custom Outputs
If you want to use other means of output, just copy trackerlyze::IO::file
and work from there. Implementing one should be easy enough ;)
The provided module trackerlyze::IO::multiplex makes it possible to use
more than one module at once. If you want to try it out, uncomment its
block in the configuration and play with it :)
Other settings
There are plenty of other settings at the top of the script. They are safe to leave alone, but here are some you might be interested in changing:
- $display_file_stats sets whether the script should give periodic output of how the different .torrents are working. This might be neat if you look at the output of the program every now and then, but is useless (and CPU/memory consuming) otherwise.
- $verbose controls whether verbose status and error messages should be given. This is usually a good idea; better too much output than not enough.
- $always_do_filestats controls whether torrent statistics should be written even if the analyzed data is very old (i.e. you are running the script over a whole logfile instead of just tailing the output). This is mostly useful for testing purposes and will slow down the script.
- $statefilename specifies the name of the file the current state is saved to at a preset interval or when quitting the script; the script will pick up where it left off when you run it again (see below for examples).
- $always_do_peakstats slows down processing of huge logfiles a bit, but you will get "peak" statistics (seeders, leechers, speed) on each torrent thus far. A good idea to leave on, usually.
- $allow_peerstats slows the script down, and uses a bit of memory; it makes it possible to create "peerstats" (individual stats over peers on a specific torrent), though. If you have CPU/mem left over and like looking at how different torrents are doing, you can leave this on. If you aim for speed, turn it off.
Graphing
Last but not least, there is graphing to take care of. There are two ways to invoke rrdtool: through the Perl module RRDs, which is faster, or by calling an external rrdtool program, which is slower. Apart from speed, the two ways don't differ.
rrd specifies the filename used for saving the graphing database,
outputdir the directory where graphs are written to. If you would
rather use an external graphing script instead of the provided
methods, specify useexternal (make sure it points to an
executable script). This is sometimes useful if you want to graph
for more than one tracker.
In order for graphing to work, you need to have rrdtool installed and
ready to use. If that's not an option, specify
trackerlyze::Graph::Dummy, and you should never hear from it again.
Running
Once all the settings have been given, you are just about ready to run the program. Usually, you just tail the output of the tracker: if the tracker is run so that its output (for example that of bttrack.py) is appended to the logfile tracker.log, you run the analyzer on the tail of that file
and it starts churning.
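The original command lines are not reproduced here, so the following is only a minimal sketch of such a pipeline. It assumes that the analyzer reads log lines on standard input and is started as `perl trackerlyze.pl`; the function names `analyzer` and `follow_log` are made up for illustration, and the default of 1000 backlog lines is the number used in the text below.

```shell
#!/bin/sh
# Sketch only: assumes trackerlyze.pl reads the tracker log on stdin.

# Hypothetical wrapper so the interpreter and script path live in one place.
analyzer() {
    perl trackerlyze.pl "$@"
}

# follow_log LOGFILE [LINES]
# Replays the last LINES lines (default 1000) of LOGFILE, then keeps
# following the file and passes every new line to the analyzer.
follow_log() {
    tail -n "${2:-1000}" -f "$1" | analyzer
}

# Example (runs until interrupted with CTRL-C):
#   follow_log tracker.log
```

Keeping the invocation inside `analyzer` means the path to the script only has to be adjusted in one place.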
If you quit the analyzer (usually with CTRL-C), it will save its statefile
(as it also does at periodic intervals, usually every 3 hours). If you want to
continue where you left off, you can just give the above command again and
it will not have lost any statistics (provided no more than 1000 log lines
were written in the meantime; in that case, just increase the number in the
call to tail).
This also means that you can process a whole backlog of logs first and then
continue with the current logfile.
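As a rough sketch of processing a backlog and then carrying on with the live log, the old logfiles can be fed to the analyzer before the current one is followed. This again assumes that the analyzer reads standard input; `analyzer`, `replay_then_follow` and the rotated logfile names are hypothetical, and the author's own command may well have looked different.

```shell
#!/bin/sh
# Sketch only: assumes trackerlyze.pl reads the tracker log on stdin.

# Hypothetical wrapper around the analyzer invocation.
analyzer() {
    perl trackerlyze.pl "$@"
}

# replay_then_follow CURRENT_LOG [OLD_LOG...]
# Feeds the old logs in the order given, then the whole current log,
# and finally keeps following the current log for new lines.
replay_then_follow() {
    current=$1
    shift
    {
        # With no old logs, skip cat so it does not wait on stdin.
        if [ "$#" -gt 0 ]; then
            cat -- "$@"
        fi
        tail -n +1 -f "$current"
    } | analyzer
}

# Example (oldest log first):
#   replay_then_follow tracker.log tracker.log.2 tracker.log.1
```

Whether torrent statistics are written while the old data is being processed depends on $always_do_filestats, as described under "Other settings".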
If you want the script to just read the statefile and write out/graph according to it, run the script with the --writeout parameter.
Signals
You can send the process signals to force an event to occur. The supported signals are:
- HUP makes the client display the runtime state after processing the current line.
- USR1 triggers graphing after the current period (this is useful when processing a backlog with $always_graph set to 0)
- USR2 triggers writing out torrent statistics after the current period. Same usefulness as above.
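Sending these signals from a shell could look roughly like the sketch below. The helper name `signal_analyzer` and the PID 12345 are made up; finding the PID of the running analyzer is left to you.

```shell
#!/bin/sh
# Sketch only: look up the analyzer's PID yourself (e.g. in your process list).

# signal_analyzer PID SIGNAL
# Sends one of the three signals described above and rejects anything else,
# so a typo cannot accidentally terminate the analyzer.
signal_analyzer() {
    case $2 in
        HUP|USR1|USR2)
            kill -s "$2" "$1"
            ;;
        *)
            echo "unsupported signal: $2" >&2
            return 2
            ;;
    esac
}

# Examples, assuming the analyzer runs as PID 12345:
#   signal_analyzer 12345 HUP    # display the runtime state
#   signal_analyzer 12345 USR1   # graph after the current period
#   signal_analyzer 12345 USR2   # write torrent statistics after the current period
```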
Win32
The script should work fine on Win32 as well. You are going to need
at least a working Perl (I suggest using ActivePerl,
which is rather easily installed), and a working rrdtool installation,
best complete with RRDs Perl integration. If you use ActivePerl 5.8.0.x,
you can use rrdtool-1.0.40.x86distr.zip-5.8.zip
directly from the source (or a newer version in the same directory).
Have a look at the readme, especially the part about RRDs.
If you want to use the SQL backend, you will also need a working
MySQL installation and working
DBI/DBD::mysql Perl modules, probably also installed from ppm.
Once everything is installed, all that is left is running the script. If your logfile is named tracker.log, running the script on that file
should get you going. Verify that stats and graphs get written to the right places (starting with trackerlyze::IO::file and RRDs is probably a good idea), and you have yourself a Win32 trackerlyze installation.
Note that the signal handlers will not work on this OS; during startup, you might get two error messages relating to USR1 and USR2. You can safely disregard them.
Using the data
OK, the analyzer is running. What now?
If you are using the SQL output, I guess you know how to access your
database and get the output you want from it :)
If you are using the file output, you could (for example) include
the statistics in your pages with Server Side Includes (SSI), pointing the
include at the statistics file for your torrent's hash HASH (for example 21098d0b5458a71904eeb3a1eda2ad98190c0577).
I'm sure you'll come up with some way to use that data ;)
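Outside of a web page, the same files can simply be read from a shell. The sketch below builds the HASH.statsname path described under "Usage" and prints its contents. The helper `show_stat`, the default directory `./status` and the statistic name `seeds` are assumptions for illustration; use your own statusdir and whichever statistics files the analyzer actually writes.

```shell
#!/bin/sh
# Sketch only: STATUSDIR must match the statusdir setting in the script.
# "./status" is a placeholder default, not the script's own default.
STATUSDIR=${STATUSDIR:-./status}

# show_stat HASH STATSNAME
# Prints the contents of STATUSDIR/HASH.STATSNAME, or "n/a" if the
# analyzer has not written that statistic (yet).
show_stat() {
    file="$STATUSDIR/$1.$2"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "n/a"
    fi
}

# Example ("seeds" is a made-up statistic name):
#   show_stat 21098d0b5458a71904eeb3a1eda2ad98190c0577 seeds
```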
Upgrading
Upgrading from 1.11 to 1.18 and retaining the statefile is possible; the script will start tracking the "new" statistics on the fly. Continuing the graph is not quite that easy: if you still have the logs, I suggest you let them be analyzed all over again. If that's not an option, you will have to find a way to add another data source to the round-robin database (I suggest looking into rrdtool dump and rrdtool restore and going from there). It's not going to be pretty, and is really NOT recommended.
If you are using the SQL backend, you will have to add a couple of fields
to the torrents-table; this should not be a problem if you have something
like phpMyAdmin to administer
your database with.
The new fields are
- peakclients INT unsigned
- peakseeds INT unsigned
- peakspeed FLOAT unsigned
It does not really matter where in the structure you insert them; the script does not use positional inserts or selects.
Changelog
- 1.11
  - Initial release
- 1.18
  - Speed improvement
  - RRDs support (now preferred method)
  - User Graphs (with Leechers/Seeders)
  - More statistics (peaks, many general stats)
  - Optimized SQL backend
  - Script configuration now completely at the top
  - More Signal Handlers, --writeout
  - Runtime checks for DBI and RRDs (no more commenting them out)
  - No longer needs Date::Parse
  - Creates .png graphs by default now
  - A couple more sanity checks on the data
  - Lots of small bugfixes
  - Win32, bnbt support
Closing & Support
That about wraps it up for now ... If you have suggestions (or, better yet, code :) to contribute, please drop me a line :)
On the off chance that you appreciate my work and would like to support
me, please consider sending me a buck or two. It doesn't have
to be much; I'm happy with anything that helps offset my
online costs a little. Of course you will be mentioned here
(if so desired).