Getting Web Statistics for Instant Downloads in ASP.NET

Posted by sshlosman Articles | Visual Basic 2010 November 09, 2005

Introduction.

One of the most important measures of website activity is resource access statistics. This information serves many purposes: optimizing website content, improving marketing campaigns, and running diagnostic tests. The web server saves detailed resource access information to its log file(s).

Many applications and tools, such as "WebTrends Log Analyser" (http://www.webtrends.com), can parse web server activity logs, compile statistics, and display them in a user-friendly format. Most of these programs report resource access statistics for a single fixed time interval. Such report generators also need some time to process the log files and prepare their reports.

In this article we present a simple ASP.NET application that walks through the web server activity logs, parses them on the fly, and displays a summary report for each fixed time interval (day, month, year) in chronological order.

Log File Parsing.

The ASP.NET application needs access to the web server activity log files in order to parse them. For demo purposes we assume that the test web server is configured to save all log files on the same PC where the ASP.NET application runs. All we need to do is read the log files in the appropriate order, parse each of them, and count all occurrences of a given key phrase, lexeme or resource name.

We also assume that the web server creates one log file per day and names it using the mask "exYYYYMMDD.log", where YYYY is the year of the log file creation date, MM the month and DD the day. This lets us take the creation date from the file name instead of parsing the contents of each log file.
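
As a minimal sketch of this naming scheme, the snippet below builds the expected file name for a given day. The module name, the function name and the two constant values are assumptions made for illustration; the article does not show how its own LogNameFormat and LogNameDateFormat settings are defined.

```vb
Imports System
Imports System.Globalization

Module LogNameSketch
    ' Assumed values that match the "exYYYYMMDD.log" mask described above.
    Public Const LogNameFormat As String = "ex{0}.log"
    Public Const LogNameDateFormat As String = "yyyyMMdd"

    ' Builds the log file name for one day, e.g. 9 November 2005 -> "ex20051109.log".
    Public Function BuildLogFileName(ByVal logDate As DateTime) As String
        Dim datePart As String = logDate.ToString(LogNameDateFormat, CultureInfo.InvariantCulture)
        Return String.Format(LogNameFormat, datePart)
    End Function
End Module
```

The invariant culture is used so that the date part does not depend on the regional settings of the server.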

Finally, the algorithm for iterating through the log files and counting the lines that contain the specified phrase is shown below:

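' Requires: Imports System.IO
' Counts the lines of one log file that contain checkWord (case-insensitive).
Private Function ProcessFile(ByVal fileName As String, ByVal checkWord As String) As Integer
    Dim wordCount As Integer = 0
    Dim upperWord As String = checkWord.ToUpper()
    ' FileShare.ReadWrite allows reading a log the web server still has open for writing.
    Dim fs As FileStream = New FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
    Dim sr As StreamReader = New StreamReader(fs)
    Dim s As String = sr.ReadLine()
    Do While s IsNot Nothing
        If s.ToUpper().IndexOf(upperWord) > -1 Then
            wordCount += 1
        End If
        s = sr.ReadLine() ' advance to the next line
    Loop
    sr.Close()
    fs.Close()
    Return wordCount
End Function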

' LogPath, LogNameFormat and LogNameDateFormat are class-level settings that are not shown in this article.
Private Function ProcessFilesByDate(ByVal checkWord As String, ByVal startDate As DateTime, ByVal endDate As DateTime) As Integer
    Dim totalWordCount As Integer = 0
    Dim dt As DateTime = startDate
    ' Visit one log file per day, in chronological order.
    Do While dt <= endDate
        Dim file0 As String = String.Format(LogNameFormat, dt.ToString(LogNameDateFormat))
        file0 = String.Format("{0}\{1}", LogPath, file0)
        If File.Exists(file0) Then
            Dim wordCount As Integer = ProcessFile(file0, checkWord)
            totalWordCount += wordCount
            ' Record the per-day count for display (helper not shown here).
            AddLogFileWordCount(dt.ToString("dd MMM yyyy"), wordCount)
        End If
        dt = dt.AddDays(1)
    Loop
    Return totalWordCount
End Function

Displaying the statistics on the web page.

The resource access statistics can be displayed chronologically, one entry per time interval. This is helpful when you want to know how often a given resource was downloaded in each interval (e.g., daily). The code below adds one row per log file to a Table control on the page; it is meant to be called from the file enumeration loop of the previous section for every log file processed:

Protected tblLogFileWordCount As System.Web.UI.WebControls.Table

Private Sub PrintLogFileWordCount(ByVal file As String, ByVal wordCount As Integer)
    Dim row As TableRow = New TableRow()
    tblLogFileWordCount.Rows.Add(row)
    Dim cell As TableCell = New TableCell()
    row.Cells.Add(cell)
    cell.Width = Unit.Percentage(20)
    cell.Text = String.Format("{0}:", Path.GetFileName(file))
    cell = New TableCell()
    row.Cells.Add(cell)
    cell.Width = Unit.Percentage(80)
    cell.Text = wordCount.ToString()
End Sub
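
The helper above prints one row per daily log file. For the monthly interval mentioned in the introduction, the daily counts could be rolled up before printing. The following is only a sketch of one way to do that; the module name, the AddToMonth helper and the "yyyy-MM" key format are assumptions and are not part of the original application.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Globalization

Module IntervalRollupSketch
    ' Adds one day's count to the running total of its month.
    ' Keys look like "2005-11", so a SortedDictionary keeps them in chronological order.
    Public Sub AddToMonth(ByVal totals As SortedDictionary(Of String, Integer), ByVal logDate As DateTime, ByVal wordCount As Integer)
        Dim key As String = logDate.ToString("yyyy-MM", CultureInfo.InvariantCulture)
        Dim current As Integer = 0
        totals.TryGetValue(key, current) ' leaves current at 0 when the month is new
        totals(key) = current + wordCount
    End Sub
End Module
```

A yearly roll-up would work the same way with "yyyy" as the key format. Each entry of the dictionary can then be printed as one table row.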

Multithreaded download statistics.

Many users have special programs for downloading large files more efficiently. Such programs (download managers) usually fetch a single web resource in several simultaneous download threads, and the web server writes a separate log record for each thread. To keep our log parser from counting these duplicate records, we extract the user IP from each log record and check it against all previously extracted IPs:

' IP addresses seen so far.
Private ipList As Hashtable = New Hashtable()

' Returns True the first time an IP is seen. An empty string is never stored.
Private Function IsNewIp(ByVal ipString As String) As Boolean
    Dim result As Boolean = Not ipList.Contains(ipString)
    If result AndAlso (Not ipString.Equals(String.Empty)) Then
        ipList.Add(ipString, ipString)
    End If
    Return result
End Function

' Returns the third space-separated field of a log record, which is assumed
' to hold the client IP, or String.Empty if the record has too few fields.
Private Function GetIp(ByVal line As String) As String
    Dim ind As Integer = line.IndexOf(" ")
    If ind > -1 Then
        ind = line.IndexOf(" ", ind + 1)
    End If
    If ind > -1 Then
        Dim indEnd As Integer = line.IndexOf(" ", ind + 1)
        If indEnd > -1 Then
            Return line.Substring(ind + 1, indEnd - ind - 1)
        End If
    End If
    Return String.Empty
End Function

' Counts the matching lines of one log file, at most once per client IP.
Private Function ProcessFile(ByVal fileName As String, ByVal checkWord As String) As Integer
    Dim wordCount As Integer = 0
    Dim upperWord As String = checkWord.ToUpper()
    Dim fs As FileStream = New FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
    Dim sr As StreamReader = New StreamReader(fs)
    Dim s As String = sr.ReadLine()
    Do While s IsNot Nothing
        If s.ToUpper().IndexOf(upperWord) > -1 Then
            ' Skip records from an IP that has already been counted.
            If IsNewIp(GetIp(s)) Then
                wordCount += 1
            End If
        End If
        s = sr.ReadLine() ' advance to the next line
    Loop
    sr.Close()
    fs.Close()
    Return wordCount
End Function
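
To illustrate the duplicate filtering on in-memory data, here is a small self-contained sketch of the same idea. It is a variant and not the article's code: it uses String.Split and a HashSet instead of IndexOf and a Hashtable, and the module and function names are hypothetical. Like GetIp above, it assumes that the client IP is the third space-separated field of each record, which depends on how logging is configured on the server.

```vb
Imports System
Imports System.Collections.Generic

Module UniqueIpSketch
    ' Returns the third space-separated field (assumed to be the client IP),
    ' or String.Empty when the line has fewer than three fields.
    Public Function ExtractThirdField(ByVal line As String) As String
        Dim fields As String() = line.Split(" "c)
        If fields.Length >= 3 Then
            Return fields(2)
        End If
        Return String.Empty
    End Function

    ' Counts the lines that contain checkWord, at most once per client IP.
    ' Lines without a recognizable IP are always counted.
    Public Function CountUniqueClients(ByVal lines As IEnumerable(Of String), ByVal checkWord As String) As Integer
        Dim seen As New HashSet(Of String)()
        Dim count As Integer = 0
        For Each line As String In lines
            If line.IndexOf(checkWord, StringComparison.OrdinalIgnoreCase) < 0 Then
                Continue For
            End If
            Dim ip As String = ExtractThirdField(line)
            ' HashSet.Add returns False when the IP was already present.
            If ip.Length = 0 OrElse seen.Add(ip) Then
                count += 1
            End If
        Next
        Return count
    End Function
End Module
```

For example, given three made-up records from 192.168.0.10 and one from 10.0.0.7 that all mention setup.zip, CountUniqueClients returns 2.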

This code is constantly being refined and improved and your comments and suggestions are always welcome.

NOTE: THIS ARTICLE IS CONVERTED FROM C# TO VB.NET USING A CONVERSION TOOL. ORIGINAL ARTICLE CAN BE FOUND ON C# CORNER (WWW.C-SHARPCORNER.COM).
