#70055 - 2002-09-20 09:28 PM
Processing/Filtering MS Proxy 2.0 logs.
BrianTX
Korg Regular
Registered: 2002-04-01
Posts: 895
I am currently evaluating software that analyzes proxy logs and returns user and organizational data from the information recorded for each "hit" in the logs. Unfortunately, most (if not all) products created for this purpose are extremely SLOW at filtering the data for a single user. I understand that it takes time to do complete data analysis for organizational reports, but for user reports, the data should first be filtered, then analyzed.... Hence this little script I've written:
code:
BREAK ON
CLS
AT (0,21) "Proxy Log File User Detail Analysis"

:fchoice
AT (2,0) "Create filtered proxy log? "
AT (3,0) "Please select Y or N --> "
AT (3,25) GET $RK
SELECT
CASE LCASE($RK)="y"
    GOTO filter
CASE LCASE($RK)="n"
    GOTO analysis
CASE 1
    GOTO fchoice
ENDSELECT

:filter
AT (2,0) "Filter log files by user id or ip address?"
AT (3,0) "Please select U or I --> "
AT (3,25) GET $RK
SELECT
CASE LCASE($RK)="u"
    $user=1
CASE LCASE($RK)="i"
    $user=0
CASE 1
    GOTO filter
ENDSELECT

AT (2,0)
IF $user
    "Please enter the user id (domain\username)."
ELSE
    "Please enter the ip address. "
ENDIF
AT (3,0) "--> "
AT (3,4) GETS $sfilter
IF $sfilter = ""
    $sfilter = "@domain\@userid"
ENDIF

AT (2,0) "Please enter source log path. "
AT (3,0) "--> "
AT (3,4) GETS $slogpath

AT (2,0) "Please enter destination log filename including path."
AT (3,0) "--> "
AT (3,4) GETS $logname
IF $logname = ""
    $logname = "default"
ENDIF
IF RIGHT($logname,4) <> ".log"
    $logname = $logname + ".log"
ENDIF

:Summary
AT (2,0) "Summary: "
AT (3,0) "$logname will be generated from logs in $slogpath"
AT (4,0) "using $sfilter as the filter."
AT (5,0) "Continue? Press 'y' to continue, 'n' to exit --> "
AT (5,49) GET $RK
SELECT
CASE LCASE($RK)="y"
    GOTO generate
CASE LCASE($RK)="n"
    GOTO end
CASE 1
    GOTO Summary
ENDSELECT

:generate
$lfile = DIR($slogpath+"\*.log")
IF $lfile = "" OR $slogpath = ""
    ? "No log files found."
    GOTO end
ENDIF
WHILE $lfile <> ""
    $lfiles = $lfiles + "|" + $lfile
    $lfile = DIR()
LOOP
$lfiles = SPLIT(SUBSTR($lfiles,2),"|")
AT (6,0) "Found " + (UBOUND($lfiles)+1) + " .log file(s)."

? "Started: @DATE @TIME"
DEL $logname
FOR EACH $lfile IN $lfiles
    ? "Processing $lfile"
    IF $user = 1
        SHELL 'cmd /c findstr /I /C:", $sfilter," $slogpath\$lfile >> $logname'
    ELSE
        SHELL 'cmd /c findstr /I /B $sfilter $slogpath\$lfile >> $logname'
    ENDIF
NEXT
? "Finished: @DATE @TIME"
IF EXIST($logname)
    ? $logname + " Created."
    ?
ENDIF
:analysis
:end
I have left room at the bottom to add in my own analysis of the data (I can write sorting and filtering algorithms just as well as some commercial software!). I figured I will just post this now and update it later after I've added analysis. This part of the script simply runs a "findstr" to eliminate all but the desired user data... then I can use the fancy commercial analysis programs on the pre-filtered data... (until I write a better analysis algorithm.)
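For anyone who doesn't read KiXtart, the pre-filtering idea can be sketched in Python. This is a hypothetical stand-in for the findstr step, not the script above: it keeps only the lines of each .log file containing a given user id or client IP (case-insensitive) and writes them to one combined output file; the directory layout and simple substring match are assumptions.

```python
import glob
import os

def filter_logs(log_dir, needle, out_path):
    """Write every line from every *.log in log_dir that contains
    `needle` (case-insensitively) to out_path. Returns the match count."""
    needle = needle.lower()
    # Collect the file list before creating the output file, so the
    # output never matches its own glob.
    paths = sorted(glob.glob(os.path.join(log_dir, "*.log")))
    matched = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path, errors="replace") as src:
                for line in src:
                    if needle in line.lower():
                        out.write(line)
                        matched += 1
    return matched
```

The commercial analyzers can then be pointed at the (much smaller) filtered file, just as the script does with its findstr output.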
Obviously, I don't have categorization databases for this, but this is a minor issue.
Any comments?
Brian [ 20. September 2002, 21:29: Message edited by: BrianTX ]
#70058 - 2002-09-22 05:08 AM
Re: Processing/Filtering MS Proxy 2.0 logs.
BrianTX
Well, I suppose I shouldn't be messing around with log files, except that each of our 4 (count 'em) proxy servers has 500+ users concurrently... and these are fairly OLD systems... I'm afraid logging via SQL will cause a slowdown for the clients -- I suppose I should test this theory before assuming, but essentially most internet usage analysis programs (the lower budget ones that still have website categorization) use the log files and NOT SQL databases.
Anyway, if someone would like to give me further advice on this, I'd love to listen. Currently, I'm evaluating WebSpy Analyzer Giga 1.0, Cyfin Reporter, Burstek LogAnalyzer, and NETIQ's WebTrends Firewall Suite... Of course each product has its deficiencies...
Brian
#70059 - 2002-09-22 05:09 PM
Re: Processing/Filtering MS Proxy 2.0 logs.
Sealeopard
KiX Master
Registered: 2001-04-25
Posts: 11164
Loc: Boston, MA, USA
Brian: I am assuming that you do not require real-time analysis. If that is the case, then you can import the log files into a database (Access, SQL Server, MSDE, MySQL, PostgreSQL, or whatever other flavor) at night or at any other slow time (maybe even copying the file beforehand). It would make log analysis much easier if you can leverage SQL queries; however, it would require some SQL knowledge.
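As a concrete (hypothetical) sketch of this approach, here is the idea using SQLite via Python's standard library, standing in for any of the databases above; the three-column layout and the sample rows are invented for illustration, since real proxy logs carry many more fields:

```python
import sqlite3

# In-memory DB for the example; a nightly import job would use a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (user TEXT, client_ip TEXT, url TEXT)")

# Stand-in rows; a real import would parse each proxy log line into fields.
rows = [
    ("DOMAIN\\alice", "10.0.0.1", "http://example.com"),
    ("DOMAIN\\alice", "10.0.0.1", "http://example.org"),
    ("DOMAIN\\bob",   "10.0.0.2", "http://example.com"),
]
conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)

# The per-user filtering done with findstr becomes a single query:
per_user = conn.execute(
    "SELECT user, COUNT(*) AS n FROM hits GROUP BY user ORDER BY n DESC"
).fetchall()
for user, n in per_user:
    print(user, n)
```

Once the data is in a table, filtering by user, by IP, or by date range is just a WHERE clause rather than a pass over every log file.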
#70060 - 2002-09-23 03:19 PM
Re: Processing/Filtering MS Proxy 2.0 logs.
BrianTX
Yes, I have thought of that (importing to SQL) as an option. However, the objective is not only to do my own analysis, but also to let other programs run their analysis on the data. Only one of the products mentioned can process data from an SQL database, so in order to use the other products I must, at the very least, be able to "re-create" the log files. Perhaps I could do this by running an export function in SQL, but because I have only limited experience with SQL at this point, I am apprehensive about going in that direction. Instead, I should be able to tightly compress the .log files and extract them whenever they are needed. After examining this for a few weeks, I have determined that we are generating about 16 GB of log files per month. This compresses to about 2 GB per month, which can be stored at roughly one 650 MB CD per week.
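The arithmetic holds up: 16 GB/month at roughly 8:1 compression is about 2 GB/month, or around 500 MB/week, which fits on a 650 MB CD. The compress-and-recreate round trip could be sketched with Python's gzip module (file names here are hypothetical); text logs compress well because entries repeat heavily:

```python
import gzip
import shutil

def compress_log(src_path, dst_path):
    """gzip src_path to dst_path; the original can be deleted afterward."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

def extract_log(src_path, dst_path):
    """Recreate the original .log file whenever an analysis tool needs it."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Because the extracted file is byte-for-byte identical to the original, any of the commercial analyzers can be run against it as if it had never been archived.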
Brian