2012-06-11

Hostile IP monitor using Twisted Python

Just finished my first trip into the wonderland of Twisted Python …



I wrote a small application to monitor Hostile IPs in a set of target BGP Autonomous Systems.


What it does in a nutshell

Blkmon monitors a set of target BGP Autonomous Systems (ASes) for Hostile IPs listed in publicly available blocklists.

The contents of public blocklists (lists of Hostile IPs) are downloaded and parsed. Hostnames are resolved using an efficient bulk DNS lookup.

A public route server is used to list the subnets for the target ASNs. This data is loaded into a Balanced Binary Tree so that each Hostile IP can be checked quickly against the target ASNs.
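
The lookup idea can be sketched with the standard library alone. This is a minimal sketch, not Blkmon's code: it swaps the balanced binary tree for a sorted list searched with bisect, assumes IPv4 only, and uses documentation prefixes as made-up data. The class name SubnetIndex is hypothetical.

```python
import bisect
import ipaddress


class SubnetIndex:
    """Answers one question: does any known subnet contain this IP?"""

    def __init__(self, cidrs):
        nets = [ipaddress.ip_network(c) for c in cidrs]
        # collapse_addresses merges nested and adjacent prefixes, so the
        # remaining ranges never overlap and a single bisect is enough.
        merged = sorted(ipaddress.collapse_addresses(nets),
                        key=lambda n: int(n.network_address))
        self._starts = [int(n.network_address) for n in merged]
        self._ends = [int(n.broadcast_address) for n in merged]

    def __contains__(self, ip):
        value = int(ipaddress.ip_address(ip))
        # Find the last range that starts at or before the address.
        i = bisect.bisect_right(self._starts, value) - 1
        return i >= 0 and value <= self._ends[i]


# Made-up subnets standing in for the route server output.
index = SubnetIndex([
    "192.0.2.0/24",
    "192.0.2.64/26",       # nested inside the /24 above
    "198.51.100.0/25",
    "198.51.100.128/25",   # adjacent to the /25 above
])

# Made-up blocklist entries.
hostile = ["192.0.2.77", "198.51.100.200", "203.0.113.9"]
hits = [ip for ip in hostile if ip in index]
print(hits)
```

Merging the prefixes first means the index can only say whether an IP falls in the monitored address space, not which ASN announced it.
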

The resulting (hopefully short) list of potential Hostile IPs is then run through a bulk whois lookup. This is used to cross-validate IP-ASN mapping.

Now comes the good stuff. Drum roll .....


Not only are the results presented in a simple web GUI, the status alert messages are also sent out using XMPP chat (e.g. Gtalk) to a list of user IDs!

Neat? You bet!

Under the covers

The application contains Twisted Python code for telnet, simple TCP, the web GUI, and DNS lookups. Wokkel, a library built on Twisted, is used for XMPP.

I won’t bother to explain the application’s processing in detail. You can find all this and more at the application's web site: http://code.google.com/p/blkmon/



Twisted python


So what is Twisted python anyway?

The Twisted Labs web site (http://twistedmatrix.com/trac/) says it this way:
Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.


What’s good

The Twisted programming framework supports a wide variety of protocols, from web through telnet / SSH / FTP to mail / POP / IMAP to XMPP chat.

Code in Twisted tends to be extremely compact.

Twisted is blazingly fast when properly coded.

And Twisted scales to enterprise-size applications such as major e-commerce sites.

And what’s not so good

Twisted is a whole new paradigm for developers used to doing ordinary synchronous coding. It is based on an asynchronous, event-driven model. Callback functions fire when an event completes and drive the follow-on processing as well as the error handling.

Like it or not, Twisted is complex.

Worse, documentation is not that good. I depended a lot on Google searches for code snippets to do what I needed to do.

Getting started

Besides the Twisted Labs documentation and sample code, there are two other sources of help for getting up to speed on Twisted.

Go sit at the feet of Mr Krondo, whose Twisted tutorial is one of the best around: http://krondo.com/?page_id=1327

And try to get your hands on a copy of O’Reilly’s Twisted Network Programming Essentials by Abe Fettig (October 20, 2005; print ISBN-13: 978-0-596-10032-2; 238 pages).

Gotchas


What kind of problems (ahem “challenges”) did I run into and have to resolve?
  • Obscure error messages: Google is your friend, as always.

  • Twisted was just too fast: I had to add some sort of throttling mechanism in various places to slow things down.

  • Gtalk didn’t want to keep the XMPP session open: Ended up doing logon / logoff every time instead of trying to manage a single XMPP session.

  • Server responses seemed to get stuck in the telnet buffers: Wrote a slow asynchronous loop to hit Enter from time to time to flush things out.

  • Line endings on Ubuntu Linux were not what the Twisted code was looking for: Moved to data mode and reworked the defective piece of code.

  • DNS lookups swamped the server PC: After implementing a “poor man’s” throttling mechanism, the app is still doing 17K lookups in about 30-35 min on a very slow tiny PC.

  • The different asynchronous processes had to communicate with one another, access the data structures, and use the worker functions, even when buried deep in a callback chain: A global data container object was implemented. This meant that only one reference was passed around instead of a whole bunch of them. It also helped keep things synchronized.

  • Sometimes things are just easier if coded in a pseudo-synchronous manner: The decorator “@defer.inlineCallbacks” lets a generator function yield Deferreds and resume when each one fires, so an ordinary loop can drive a series of asynchronous steps.

  • Whois lookups were too slow: Found a Balanced Binary Tree algorithm at dzone and used it to store all the subnets for all the target ASNs. Then each Hostile IP was looked up in the Binary Tree. This sped things up by an order of magnitude.

  • The binary tree algorithm did arithmetic comparisons: Google’s ipaddr module was used to overload the comparison operators with equivalent subnet comparison / membership operations.

  • Checking whether a Hostile IP was in one of the AS subnets was sometimes not completely accurate: A bulk whois lookup was done on the set of candidate IPs to cross-validate the ASN-IP relationship.

  • How to protect the web server interface from attack: The input is truncated to the smallest workable size. Next, a regular expression filters the input. Finally, the code is enclosed in a “try” block to catch exceptions. Could have escaped output as well but didn’t (ahem!) think this was necessary.
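
The input checks in that last item might look roughly like this. It is a sketch under assumptions, not the application's code: the length limit, the allowed character set and the function names are all made up, and the output escaping shown at the end is the step the application skipped.

```python
import html
import re

MAX_LEN = 18                       # assumption: room for "255.255.255.255/32"
ALLOWED = re.compile(r"[0-9./]+")  # assumption: digits, dots and slash only


def clean_query(raw):
    """Return a safe query string, or None if the input is rejected."""
    try:
        candidate = raw[:MAX_LEN]              # 1. truncate
        if not ALLOWED.fullmatch(candidate):   # 2. filter with a regex
            return None
        return candidate
    except TypeError:                          # 3. catch bad input types
        # For example raw is None because the form field was missing.
        return None


def safe_echo(raw):
    """Escape untrusted text before writing it back into an HTML page."""
    return html.escape(str(raw)[:MAX_LEN])


good = clean_query("192.0.2.0/24")
rejected = clean_query("1.2.3.4<script>alert(1)</script>")
missing = clean_query(None)
echoed = safe_echo("<b>hi</b>")
print(good, rejected, missing, echoed)
```

Truncating before matching keeps the regular expression from ever seeing more than a few characters, whatever the client sends.
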
Well, that’s about it. Details are over at Google Code (see the link above).

Enjoy!
