new: ipv6loganon for anonymizing HTTP logs without losing much information
Hi,

triggered by some global discussions about privacy, I've created a new program named "ipv6loganon", which can anonymize HTTP logs without losing much information (e.g. the IPv6 address type, which is kept for further statistics). Some more information is below.

Currently, it cannot be used as a pipe for the Apache HTTP server; the reason is unknown. Unfortunately, the following config does not work (nothing is logged anymore):

    CustomLog "| /usr/local/bin/ipv6loganon-static |/usr/sbin/cronolog /path/to/logs/server-log.%Y%m" combined

Perhaps someone can point me to a solution - thank you.

Some code improvements triggered by splint were also done. Help with running splint on the md5 subdirectory as well would be welcome; currently splint doesn't like the code there.

Please run some tests. I plan to release package 0.70.0 in one week or so.

Regards,
Peter

$ cat README
$Id: README,v 1.2 2007/02/01 14:44:21 peter Exp $

ipv6loganon is a HTTP server log file anonymizer

It expects a log line on stdin with an IPv4/IPv6 address as first token. This token would be anonymized according to given/default options.

The anonymizer would keep as much information as possible for IPv6 address types.

Client-side IID would be anonymized by
 - EUI-48 based: serial number would be zero'ed, keeping OID
 - EUI-64 based: serial number would be zero'ed, keeping OID
 - ISATAP: client IPv4 address would be anonymized by given IPv4 mask
 - TEREDO: client IPv4 address would be anonymized by given IPv4 mask
           client port would be zero'ed
 - 6to4(Microsoft): client IPv4 address would be anonymized by given IPv4 mask
 - local: whole IID would be zero'ed

Client-side SLA would be anonymized by
 - SLA would be zero'ed

Prefix would be anonymized by
 - 6to4: client IPv4 address would be anonymized by given IPv4 mask

Compat/Mapped IPv4 addresses would be anonymized by
 - IPv4 address would be anonymized by given IPv4 mask

Afterwards, the modified address and the trailing line would be printed to stdout.
Example:

Original lines (stdin):
207.46.98.53 - - [01/Jan/2007:00:01:15 +0100] "GET /Linux+IPv6-HOWTO/x1112.html HTTP/1.0" 200 6162 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 253 6334
2002:52b6:6b01:1:216:17ff:fe01:2345 - - [10/Jan/2007:15:04:28 +0100] "GET /favicon.ico HTTP/1.1" 200 4710 "http://www.bieringer.de/linux/IPv6/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.9) Gecko/20061219 Fedora/1.5.0.9-1.fc6 Firefox/1.5.0.9 pango-text" 413 5005

Modified lines (stdout):
207.46.98.0 - - [01/Jan/2007:00:01:15 +0100] "GET /Linux+IPv6-HOWTO/x1112.html HTTP/1.0" 200 6162 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 253 6334
2002:52b6:6b00:0:216:17ff:fe00:0 - - [10/Jan/2007:15:04:28 +0100] "GET /favicon.ico HTTP/1.1" 200 4710 "http://www.bieringer.de/linux/IPv6/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.9) Gecko/20061219 Fedora/1.5.0.9-1.fc6 Firefox/1.5.0.9 pango-text" 413 5005

--
Dr. Peter Bieringer                      http://www.bieringer.de/pb/
GPG/PGP Key 0x958F422D                   mailto:pb@bieringer.de
Deep Space 6 Co-Founder and Core Member  http://www.deepspace6.net/
OpenBC                                   http://www.openbc.com/hp/Peter_Bieringer/
Personal invitation to OpenBC            http://www.openbc.com/go/invita/3889
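
The rules from the README can be illustrated with a small sketch. This is not the ipv6loganon source (which is C); it is a Python approximation that covers only the cases visible in the README example: IPv4 masking, the 6to4 prefix, the SLA, and an EUI-48 based IID. The default mask of 24 bits, the function names, and the fallback of zeroing every other IID are assumptions made for illustration.

```python
import ipaddress

IID_MASK = 0xFFFFFFFFFFFFFFFF


def anonymize_ipv4(addr, mask_bits=24):
    """Zero the bits of an IPv4 address that fall outside the mask."""
    mask = (0xFFFFFFFF << (32 - mask_bits)) & 0xFFFFFFFF
    return ipaddress.IPv4Address(int(addr) & mask)


def anonymize_ipv6(addr, mask_bits=24):
    """Apply a subset of the README rules to one IPv6 address."""
    value = int(addr)
    prefix, iid = value >> 64, value & IID_MASK

    # Client-side SLA (assumed here to be the 4th 16-bit group) is zeroed.
    prefix &= 0xFFFFFFFFFFFF0000

    # A 6to4 prefix (2002::/16) carries the client IPv4 address in the
    # next 32 bits; that address is masked like a plain IPv4 address.
    if prefix >> 48 == 0x2002:
        v4 = ipaddress.IPv4Address((prefix >> 16) & 0xFFFFFFFF)
        prefix = (0x2002 << 48) | (int(anonymize_ipv4(v4, mask_bits)) << 16)

    # An EUI-48 based IID has ff:fe in the middle: keep the vendor part
    # (the README's "OID") and zero the 24-bit serial number.
    if (iid >> 24) & 0xFFFF == 0xFFFE:
        iid &= 0xFFFFFFFFFF000000
    else:
        # Assumption of this sketch: every other IID is zeroed completely,
        # which is what the README describes for "local" only.
        iid = 0

    return ipaddress.IPv6Address((prefix << 64) | iid)


def anonymize_line(line, mask_bits=24):
    """Anonymize the first token of a log line; keep the rest untouched."""
    token, sep, rest = line.partition(" ")
    try:
        addr = ipaddress.ip_address(token)
    except ValueError:
        return line  # first token is not an address: pass through
    if addr.version == 4:
        anon = anonymize_ipv4(addr, mask_bits)
    else:
        anon = anonymize_ipv6(addr, mask_bits)
    return f"{anon}{sep}{rest}"


def filter_stream(src, dst):
    """Copy src to dst line by line, flushing after every line so that
    output is not held back when dst is a pipe (a choice of this sketch)."""
    for line in src:
        dst.write(anonymize_line(line))
        dst.flush()

# Usage as a stdin/stdout filter:
#   import sys; filter_stream(sys.stdin, sys.stdout)
```

Under these assumptions, the two input addresses from the README example come out as 207.46.98.0 and 2002:52b6:6b00:0:216:17ff:fe00:0, matching the README output. ISATAP, Teredo, EUI-64 and compat/mapped addresses are not handled separately here.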
On Thu, Feb 01, 2007 at 03:50:13PM +0100, Peter Bieringer wrote:
> triggered by some global discussions about privacy I've created a new program named "ipv6loganon", which is able to anonymize HTTP logs without losing much information (e.g. the IPv6 address type - for further statistics).
>
> Please run some tests. I plan to release package 0.70.0 in one week or so.
Hi,

ipv6loganon looks nice, thanks.

Here's a small patch that fixes a couple of typos and adds a man page for ipv6loganon, based on --help output and the README.

The --enable-system-geoip stuff seems to work OK for me. It's great to have, now I don't need to change the build system at all for the Debian packages.

I still plan to hack the build system so that it could use an external copy of the ip2location and geoip source trees, I just haven't gotten around to doing it yet. With it there would be no need of shipping those in the ipv6calc tarball, which would be a good thing IMO.

Cheers,
--
Niko Tyni ntyni@iki.fi
Hi Niko,

At 02.02.2007 09:40, Niko Tyni wrote:
> ipv6loganon looks nice, thanks.
>
> Here's a small patch that fixes a couple of typos and adds a man page for ipv6loganon, based on --help output and the README.
Thank you very much for your patches and contribution; they are applied to CVS now.

I have also moved the anonymization code into the related libraries, because I plan to add a new action to ipv6calc that anonymizes a single IPv4/IPv6 address for external usage (e.g. for anonymizing logs which are harder to parse in C but simple to parse in Perl, such as postfix logs).
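
To make the "external usage" idea concrete, here is a rough sketch of a script-side parser that hands each address it finds to a single-address anonymizer. It is written in Python rather than Perl, and everything in it is hypothetical: the function names, the sample log line, and the stand-in anonymizer, which simply truncates IPv4 to /24 and IPv6 to /48 instead of calling the planned ipv6calc action.

```python
import ipaddress
import re

# Mail logs of this style put the client address in square brackets,
# e.g. "connect from mail.example.com[192.0.2.55]".
BRACKETED = re.compile(r"\[([0-9A-Fa-f:.]+)\]")


def anonymize_address(text, v4_bits=24, v6_bits=48):
    """Stand-in for an external single-address anonymizer.

    Raises ValueError if text is not an IPv4/IPv6 address.
    """
    addr = ipaddress.ip_address(text)
    bits = v4_bits if addr.version == 4 else v6_bits
    net = ipaddress.ip_network(f"{addr}/{bits}", strict=False)
    return str(net.network_address)


def anonymize_mail_log_line(line):
    """Replace every bracketed address in the line; leave the rest as is."""
    def replace(match):
        try:
            return f"[{anonymize_address(match.group(1))}]"
        except ValueError:
            # Bracketed text that is not an address (e.g. a process ID).
            return match.group(0)
    return BRACKETED.sub(replace, line)
```

The point of the split is that the log-format knowledge stays in the script, while the address rules live in one place and can be swapped for the real anonymizer once it exists.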
> I still plan to hack the build system so that it could use an external copy of the ip2location and geoip source trees, I just haven't gotten around to doing it yet. With it there would be no need of shipping those in the ipv6calc tarball, which would be a good thing IMO.

For sure, it would be great if you find time to do the work.

Regards,
Peter
On Mon, Feb 05, 2007 at 05:42:18PM +0100, Peter Bieringer wrote:
> At 02.02.2007 09:40, Niko Tyni wrote:
>> I still plan to hack the build system so that it could use an external copy of the ip2location and geoip source trees, I just haven't gotten around to doing it yet. With it there would be no need of shipping those in the ipv6calc tarball, which would be a good thing IMO.
>
> For sure, would be great if you find time to do the work.
OK, here we go. This patch adds the following configure options:

  --with-ip2location-headers=DIR
  --with-ip2location-lib=DIR
  --with-ip2location-static

  --with-geoip-headers=DIR
  --with-geoip-lib=DIR
  --with-geoip-static

Furthermore, it renames (for consistency, YMMV)

  --enable-geoip-default-file => --with-geoip-default-file

and removes the now unnecessary

  --enable-geoip-system

There's also some documentation, including instructions on building IP2Location (README.BUILDING-IP2LOCATION), as I ran into some problems with it.

The removal of the libraries is not in the patch, you'll have to remove databases/IP2Location and databases/GeoIP yourself. I did modify the LICENSE file to anticipate this, though :)

You might want to consider distributing a modified and bootstrapped version of the IP2Location library tarball separately on your site to make it easier for users. However, the license of the bigDigits stuff included [1,2] may pose a problem - as I read it, there's no permission to distribute the bigDigits source code at all, and the combination is also undistributable because it's linking GPL code with GPL-incompatible code [3].

[1] e.g. libIP2Location/bigd.c
[2] http://www.di-mgt.com.au/bigdigitsCopyright.txt
[3] http://www.gnu.org/licenses/gpl-faq.html#WhatDoesCompatMean

Cheers,
--
Niko Tyni ntyni@iki.fi
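
For anyone with existing build scripts, the rename and removal described in this message could be applied mechanically. The following is only a sketch of that idea: the helper name and the sample path are made up, and the option names are taken from the message as written, not checked against the patch itself.

```python
# Options renamed by the patch (old name -> new name), per the message.
RENAMED = {"--enable-geoip-default-file": "--with-geoip-default-file"}

# Options removed by the patch, per the message.
REMOVED = {"--enable-geoip-system"}


def migrate_configure_args(args):
    """Translate an old-style ./configure argument list to the new names.

    Any "=VALUE" suffix is carried over unchanged; unknown options are
    passed through untouched.
    """
    migrated = []
    for arg in args:
        name, sep, value = arg.partition("=")
        if name in REMOVED:
            continue  # option no longer exists, drop it
        migrated.append(RENAMED.get(name, name) + sep + value)
    return migrated
```

For example, `migrate_configure_args(["--enable-geoip-system", "--enable-geoip-default-file=/path/to/GeoIP.dat"])` would leave only the renamed default-file option, with its value intact.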
Hi Niko,

At 09.02.2007 21:48, Niko Tyni wrote:
> On Mon, Feb 05, 2007 at 05:42:18PM +0100, Peter Bieringer wrote:
>> At 02.02.2007 09:40, Niko Tyni wrote:
>>> I still plan to hack the build system so that it could use an external copy of the ip2location and geoip source trees, I just haven't gotten around to doing it yet. With it there would be no need of shipping those in the ipv6calc tarball, which would be a good thing IMO.
>>
>> For sure, would be great if you find time to do the work.
>
> OK, here we go. This patch adds the following configure options:
>
>   --with-ip2location-headers=DIR
>   --with-ip2location-lib=DIR
>   --with-ip2location-static
>
>   --with-geoip-headers=DIR
>   --with-geoip-lib=DIR
>   --with-geoip-static
>
> Furthermore, it renames (for consistency, YMMV)
>
>   --enable-geoip-default-file => --with-geoip-default-file
>
> and removes the now unnecessary
>
>   --enable-geoip-system
>
> There's also some documentation, including instructions on building IP2Location (README.BUILDING-IP2LOCATION), as I ran into some problems with it.

Thank you very much for the patch!

> The removal of the libraries is not in the patch, you'll have to remove databases/IP2Location and databases/GeoIP yourself.

Done in CVS now.

> I did modify the LICENSE file to anticipate this, though :)

Great!

> You might want to consider distributing a modified and bootstrapped version of the IP2Location library tarball separately on your site to make it easier for users. However, the license of the bigDigits stuff included [1,2] may pose a problem - as I read it, there's no permission to distribute the bigDigits source code at all, and the combination is also undistributable because it's linking GPL code with GPL-incompatible code [3].
>
> [1] e.g. libIP2Location/bigd.c
> [2] http://www.di-mgt.com.au/bigdigitsCopyright.txt
> [3] http://www.gnu.org/licenses/gpl-faq.html#WhatDoesCompatMean

Probably the IP2Location people didn't notice this, because they distribute this code along with their GPL LICENSE file. I will notify them asap.

Regards,
Peter
Hi again,

At 09.02.2007 21:48, Niko Tyni wrote:
> On Mon, Feb 05, 2007 at 05:42:18PM +0100, Peter Bieringer wrote:
>> At 02.02.2007 09:40, Niko Tyni wrote:
>>> I still plan to hack the build system so that it could use an external copy of the ip2location and geoip source trees, I just haven't gotten around to doing it yet. With it there would be no need of shipping those in the ipv6calc tarball, which would be a good thing IMO.
>>
>> For sure, would be great if you find time to do the work.
>
> OK, here we go. This patch adds the following configure options:
>
>   --with-ip2location-headers=DIR
>   --with-ip2location-lib=DIR
>   --with-ip2location-static
>
>   --with-geoip-headers=DIR
>   --with-geoip-lib=DIR
>   --with-geoip-static
>
> Furthermore, it renames (for consistency, YMMV)
>
>   --enable-geoip-default-file => --with-geoip-default-file
>
> and removes the now unnecessary
>
>   --enable-geoip-system

I also had to make changes to ipv6calc/Makefile.in to match your changes; it is working now. I also fixed the spec file, where the with-* options didn't work at all.

> There's also some documentation, including instructions on building IP2Location (README.BUILDING-IP2LOCATION), as I ran into some problems with it.
>
> The removal of the libraries is not in the patch, you'll have to remove databases/IP2Location and databases/GeoIP yourself. I did modify the LICENSE file to anticipate this, though :)
>
> You might want to consider distributing a modified and bootstrapped version of the IP2Location library tarball separately on your site to make it easier for users.
After a little work, I have created a spec file (attached) with all your changes.

@others: please do some build/run tests to check whether the newly introduced or changed options work properly.

Thank you,
Peter

%define IP2Location_ver 2.1.1

Summary: IP2Location C API
Name: IP2Location
Version: %{IP2Location_ver}
Release: 1
Source: http://www.ip2location.com/download/C-IP2Location-%{IP2Location_ver}.tar.gz
Buildroot: %{_tmppath}/%{name}-%{version}-root
License: GPL
Group: Development/Libraries
Packager: Peter Bieringer <pb@bieringer.de>
Vendor: IP2Location <http://www.ip2location.com>
BuildPreReq: automake autoconf libtool

%description
IP2Location is a C library that enables the user to find the country that any IP address or hostname originates from.

%package -n IP2Location-devel
Version: %{IP2Location_ver}
Summary: Development headers and libraries for IP2Location
Group: Development/Libraries

%description -n IP2Location-devel
Development headers and static libraries for building IP2Location-based applications

%prep
%setup -n C-%{name}-%{IP2Location_ver}

## BUGFIXES BEGIN
# Fix DOS-style endings in configure.ac
perl -pi.bak -e 's/\r//' configure.ac
## Patch libIP2Location/Makefile.am for proper storage
perl -pi.bak -e 's/pkglib_LTLIBRARIES = libIP2Location.la/lib_LTLIBRARIES = libIP2Location.la\ninclude_HEADERS = IP2Location.h bigd.h/' libIP2Location/Makefile.am
## BUGFIXES END

autoreconf -i -v --force
./configure --prefix=%{_prefix} --libdir=%{_libdir}

%build
make

%install
rm -rf $RPM_BUILD_ROOT
make DESTDIR=$RPM_BUILD_ROOT install

# Install demo databases
mkdir -p $RPM_BUILD_ROOT%{_datadir}/%{name}
cp data/* $RPM_BUILD_ROOT%{_datadir}/%{name}

%files -n IP2Location
%doc COPYING NEWS README AUTHORS INSTALL
%{_libdir}/libIP2Location.so
%{_datadir}/*

%files -n IP2Location-devel
%{_libdir}/libIP2Location.a
%{_libdir}/libIP2Location.la
%{_includedir}/*

%changelog
* Wed Feb 14 2007 Peter Bieringer <pb@bieringer.de> 2.1.1-1
- Initial creation (some hints from Niko Tyni)
On Wed, Feb 14, 2007 at 11:31:17PM +0100, Peter Bieringer wrote:
> I had to make also changes to ipv6calc/Makefile.in according to your changes, working now.

Hm, it certainly was included in the patch I sent. I wonder what went wrong.

Anyhow, the version in CVS looks good to me. I used GETOBJS instead of LDFLAGS, but I'm sure both will work :)

> Also I fixed the spec file, where the with-* options didn't work at all.

Sorry about that. Forgot it completely, as I'm not using it myself.

>> There's also some documentation, including instructions on building IP2Location (README.BUILDING-IP2LOCATION), as I ran into some problems with it.

This file seems to be missing from the CVS repository, is that intentional?

Cheers,
--
Niko Tyni ntyni@iki.fi
At 15.02.2007 09:23, Niko Tyni wrote:
> On Wed, Feb 14, 2007 at 11:31:17PM +0100, Peter Bieringer wrote:
>> I had to make also changes to ipv6calc/Makefile.in according to your changes, working now.
>
> Hm, it certainly was included in the patch I sent. I wonder what went wrong.

It looks like something went wrong while applying your patch. Meanwhile I found Makefile.in.rej in the directory above...

> Anyhow, the version in CVS looks good to me. I used GETOBJS instead of LDFLAGS, but I'm sure both will work :)

Looks like it.

>>> There's also some documentation, including instructions on building IP2Location (README.BUILDING-IP2LOCATION), as I ran into some problems with it.
>
> This file seems to be missing from the CVS repository, is that intentional?

No, it was forgotten.

Peter