Log Mining Toolkit

 Contact Information

Tao Li, Advisor
School of Computing and Information Sciences, Florida International University
11200 SW 8th Street, Miami, 33199, USA
Email: taoli at cs.fiu.edu, URL: http://www.cs.fiu.edu/~taoli

Liang Tang, PhD student
School of Computing and Information Sciences, Florida International University
11200 SW 8th Street, Miami, 33199, USA
Email: ltang002 at cs.fiu.edu, URL: http://www.cs.fiu.edu/~ltang002

Introduction

Log Mining Toolkit is a solution for analyzing and mining massive texual log information. It first collects the log informtion from the system, and then uses data mining techniques to analyze the system log information. The toolkit can help people understand what happened inside a complex system and also facilitate the root-case analysis of malfunctions.

Our Log Mining Toolkit integrates serval related subprojects. The first subproject is the LogTree, which is introduced in the following section.

LogTree: A framework of generating system events from raw textual logs

Our first work is to convert the raw textual log messages into system events. Many raw log messages are generated by various components of a complex system, customizing a special log converter for one system is a time-consuming task. Furthermore, the incompleteness and inconsistency of documentation cause customizing log converters to become really difficult, if not impossible. In our solution, we use data clustering approach to divide the log messages into different groups using both the message terms and message format information. Each group represents one type of system events.

We developed a simple and basic demo by Java 1.5, which is a console application.

Screenshots:

The main window of the LogTree console appplication
The created event timeline
The created event histogram

Publications: 

Demo: 

logtree.jar ( This demo is not fully implemented, just for show the use cases. If you have any question, please email me).
This demo only contains four parsers for fileZilla log, mysql error log, apache error log and pvfs2 log. There are 4 testing log files for each of them: filezilla_1K.log, mysqlerr.txt, apache_error_1K.log, pvfs2_1K.log. Note when you open this log file in the demo, please specify what type of log file.

Source Code:  

 logtree_alpha0.1_src.zip.  This zip package includes two directories: lib and src. All the implementation source code is inside "src" directory. "lib" directory contains the neccessary libraries for this source code.  In this demo, we use JUNG and JFreeChart to visualize the generated events.

 

Locations of visitors to this page