We were asked recently to provide a simple method to let Department of Computing researchers make software or datasets available for download, and then track who downloads that material.
This Download Tracking System (DTS) is what we've come up with to provide this facility, for anyone who'd like to use it.
For each user who downloads software via the DTS, we gather the following pieces of information:
- Name
- Institution
- Email address
THE BASIC IDEA
Without the Download Tracking System, you'd probably create a single web-accessible directory for the software, containing the downloadable software itself (usually as a tarball) along with a web page describing the software and how to download, install and use it. This web page would contain one or more "download this piece of wonderful software" links to the tarball in the same directory.
When you use the Download Tracking System, you still proceed in the same fashion; the only difference is that you replace the "download this software" link with a link to the Download Tracking System web page, naming the package (and hence the tarball).
When a user wants to download the software, they read your instructions page, click on the "download this wonderful software" link, and the DTS will ask them to provide their details (name, institution, email) before providing the link to the tarball in the original directory.
THE DETAILED PROCEDURE
Let us consider a simple example. We're about to write the most wonderful program in the world (ok, ok, it's the good old "hello world" example), then make it available to the world.
STAGE 1: CREATING THE SOFTWARE
In some suitable non-web-accessible source directory, let's create a directory and populate it with the desired materials (we'll give this as a series of Linux commands; Windows users should ssh into a shell server, shell[1-2-3-4].doc.ic.ac.uk, and work there):
- Create the directory, enter it, create the first file:
mkdir -p ~/src/hello/hello-1.0
cd !$
- Now create the contents of hello.c (using your favourite editor: vi, emacs, nano, gedit) as follows:
#include <stdio.h>
#include <stdlib.h>

int main( void )
{
	printf( "hello world\n" );
	exit(0);
}
- Now create a Makefile containing the following:
CC = gcc
CFLAGS = -Wall
all: hello
clean:
	/bin/rm -f *.o core hello
hello: hello.o
- Add other useful files, such as a README describing what the software does, eg:
hello.c: the most wonderful piece of original software ever. it's great.
- Plus an INSTALL file saying how to compile and install it:
compile with: make
run with: ./hello
- Check that the software builds (make, ./hello) and cleans neatly (make clean).
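If you'd rather not do that last check by hand each time, it can be scripted. This is only a sketch: check_build is a name we've made up for illustration, and it assumes make and a C compiler are installed and that the program prints a known line of output.

```shell
#!/bin/sh
# Sketch only: check_build is a hypothetical helper, not part of the DTS.
#   $1 = source directory containing the Makefile
#   $2 = name of the program the Makefile builds
#   $3 = the output we expect the program to print
check_build() {
    dir=$1 prog=$2 expected=$3
    (
        cd "$dir" || exit 1

        # Build quietly; give up if the build fails.
        make >/dev/null || exit 1

        # Run the program and compare its output with what we expect.
        if [ "$(./"$prog")" != "$expected" ]; then
            echo "unexpected output from $prog" >&2
            exit 1
        fi

        # Clean up and make sure the binary really has gone.
        make clean >/dev/null || exit 1
        if [ -e "$prog" ]; then
            echo "make clean left $prog behind" >&2
            exit 1
        fi
    )
}
```

For the example above you might call it as: check_build ~/src/hello/hello-1.0 hello "hello world"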
STAGE 2: PUBLISHING THE SOFTWARE VIA THE DTS
- Choose a name for the software, eg hello, and decide where in web-accessible space to store this directory (eg ~dcw/public_html/hello).
- Start in the source directory for this software:
cd ~/src/hello/hello-1.0
- Build a tarball called "hello-download.tgz" which contains your software: the whole hello-1.0 directory after cleaning (it's good practice to hand out tarballs that create version-numbered directories when extracted):
make clean
cd ..
tar czvf hello-download.tgz hello-1.0
- Check the tarball is good:
tar tzvf hello-download.tgz
- Copy the tarball into the web-accessible location:
mkdir ~dcw/public_html/hello
cp hello-download.tgz !$
- Create an index.html web page (perhaps based on your README) that describes the software and contains a download link URL of the form:
<pre>
<a href="http://www.doc.ic.ac.uk/download/?package=hello">hello-download.tgz</a>
</pre>
- index.html for hello might say:
<html>
<head>
<title>hello - the software you've all been waiting for</title>
</head>
<body>
<h1>hello - the most wonderful piece of original software ever.</h1>
<p>
Download the hello tarball via our download form at:
<pre>
<a href="http://www.doc.ic.ac.uk/download/?package=hello">Download the Hello Tarball</a>
</pre>
<p>
Extract the tar ball, then read the INSTALL file for building and packaging instructions.
<hr>
</body></html>
- To recap, the directory ~dcw/public_html/hello now contains index.html and hello-download.tgz, and index.html contains the above download link.
- Now, if a user points their web browser at
http://www.doc.ic.ac.uk/~dcw/hello/
- they will see the index page:
hello - the most wonderful piece of original software ever.
Download the hello tarball via our download form at:
Download the Hello Tarball
Extract the tar ball, then read the INSTALL file for building and packaging instructions.
- and if they then click on "Download the Hello Tarball" they'll see the download form:
Welcome to the Imperial College Department of Computing download system
Thanks for expressing an interest in downloading hello.
We would like you to tell us who you are, where you're from, and how we can contact you before we let you download hello:-)
Name: ______________
Institution: ______________
Email: ______________
Submit Clear
- when they fill that form in and click submit, they get a link to click on that will actually download the tarball:
Thanks for expressing an interest in downloading hello.
Thank you for filling in your details. They are:
Name: dunc the hunk
Institution: hunkland
Email: dunc@hunkland.com
To download hello, please click on the link below:
http://www.doc.ic.ac.uk/~dcw/hello/hello-download.tgz
- Assuming they bother to download the software, they should then be able to follow the instructions in the web page.
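The packaging steps at the start of Stage 2 (clean, tar, check, copy) could be wrapped up in a small shell function if you expect to republish often. This is a sketch under our own assumptions: make_dts_tarball is a made-up name, and the only convention taken from the steps above is that the tarball is called package-download.tgz and extracts into a version-numbered directory.

```shell
#!/bin/sh
# Sketch only: make_dts_tarball is a hypothetical helper, not part of the DTS.
#   $1 = parent directory holding the versioned directory (eg ~/src/hello)
#   $2 = versioned directory name (eg hello-1.0)
#   $3 = package name (eg hello)
#   $4 = web-accessible destination directory (eg ~/public_html/hello)
make_dts_tarball() {
    parent=$1 versioned=$2 package=$3 dest=$4
    tarball="$package-download.tgz"

    # Clean first if there is a Makefile, so no build products are shipped.
    if [ -f "$parent/$versioned/Makefile" ]; then
        ( cd "$parent/$versioned" && make clean ) || return 1
    fi

    # Build the tarball from the parent so it extracts into $versioned/.
    ( cd "$parent" && tar czf "$tarball" "$versioned" ) || return 1

    # Refuse to publish if any entry lies outside the versioned directory.
    if ! tar tzf "$parent/$tarball" |
        awk -v p="$versioned/" 'index($0, p) != 1 { bad = 1 } END { exit bad + 0 }'
    then
        echo "unexpected entries in $tarball" >&2
        return 1
    fi

    # Copy the tarball into the web-accessible location.
    mkdir -p "$dest" && cp "$parent/$tarball" "$dest/"
}
```

For the hello example this would be called as: make_dts_tarball ~/src/hello hello-1.0 hello ~/public_html/hello. You still need to write index.html yourself.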
STAGE 3: CHECKING THE LOGS
- When you want to check the logs to find out who has downloaded your software:
less /vol/www/doc/download_log/hello.log
- This file contains CSV entries of the form:
2012/02/01 18:50:53,dunc the hunk,hunkland,dunc@hunkland.com,http://www.doc.ic.ac.uk/~dcw/hello/hello-download.tgz
- Note that there is a separate log file per package, so "package names" really should be unique within DoC (to avoid your "hello" downloads getting mixed up with someone else's "hello" downloads). The simplest thing is to check that the above log file does not already exist before you start using a package name, or to avoid clashes by using a package name of the form username-softwarename, or groupname-softwarename for research group software.
- If two package names did clash, note that the log entries contain the URL of the tarball, so one could still disentangle the downloads from ~dcw/hello from someone else's downloads with some simple grep'ping. But that would be a pain; better to avoid it.
- Please let us know (contact the CSG Helpdesk) if you have any difficulties getting the DTS to work, or even if you want to congratulate us.
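The Stage 3 checks (is a package name already taken, who has downloaded, and which entries belong to a given tarball URL) can be sketched as a few shell functions. The function names are our own invention, and the sketch assumes the five-field layout shown above, with no commas inside the name or institution fields; a name containing a comma would confuse it.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not part of the DTS.
# Assumed log layout: timestamp,name,institution,email,url (no embedded commas).

# Succeeds if no log file exists yet for the package, ie the name looks unused.
#   $1 = log directory, $2 = package name
package_name_is_free() {
    [ ! -e "$1/$2.log" ]
}

# Prints the log entries whose final field is exactly the given tarball URL.
#   $1 = log file, $2 = tarball URL
downloads_for_url() {
    awk -F, -v url="$2" '$NF == url' "$1"
}

# Prints each distinct downloader as "name <email>" (fields 2 and 4).
#   $1 = log file
downloaders() {
    awk -F, '{ print $2 " <" $4 ">" }' "$1" | sort -u
}
```

For example, package_name_is_free /vol/www/doc/download_log dcw-hello tells you whether dcw-hello is still unused, and downloads_for_url /vol/www/doc/download_log/hello.log http://www.doc.ic.ac.uk/~dcw/hello/hello-download.tgz picks out only the ~dcw downloads if a name clash has already happened.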