<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Feedback and Boredom Result in 35% Performance Boost for Loghetti</title>
	<atom:link href="http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/</link>
	<description>Made with only the finest 1's and 0's</description>
	<lastBuildDate>Wed, 10 Mar 2010 20:02:17 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Kent Johnson</title>
		<link>http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/comment-page-1/#comment-193</link>
		<dc:creator>Kent Johnson</dc:creator>
		<pubDate>Mon, 17 Mar 2008 02:02:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/#comment-193</guid>
		<description>A smarter regex makes a huge improvement. The current regex has to do a lot of backtracking because of all the .* matches. Try this one:
re.compile(r&#039;(\d+\.\d+\.\d+\.\d+) ([^ ]*) ([^ ]*) \[([^ ]*) [^\]]*\] &quot;([^&quot;]*)&quot; (\d+) ([^ ]*) &quot;([^&quot;]*)&quot; &quot;([^&quot;]*)&quot;&#039;)

I have a few smaller improvements, too. Drop me an email and I will send them to you; they are a bit much for a comment.</description>
		<content:encoded><![CDATA[<p>A smarter regex makes a huge improvement. The current regex has to do a lot of backtracking because of all the .* matches. Try this one:<br />
re.compile(r'(\d+\.\d+\.\d+\.\d+) ([^ ]*) ([^ ]*) \[([^ ]*) [^\]]*\] "([^"]*)" (\d+) ([^ ]*) "([^"]*)" "([^"]*)"')</p>
<p>I have a few smaller improvements, too. Drop me an email and I will send them to you; they are a bit much for a comment.</p>
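<p>A minimal sketch of how the pattern above could be exercised, assuming the log is in Apache combined format. The sample line and the names LOG_RE and SAMPLE are illustrative assumptions, not code from loghetti or apachelogs.</p>

```python
import re

# Pattern quoted from the comment above. Negated character classes such as
# [^ ]* and [^"]* stop at their delimiter, so the engine does far less
# backtracking than it would with a chain of ".*" groups.
LOG_RE = re.compile(
    r'(\d+\.\d+\.\d+\.\d+) ([^ ]*) ([^ ]*) \[([^ ]*) [^\]]*\] '
    r'"([^"]*)" (\d+) ([^ ]*) "([^"]*)" "([^"]*)"'
)

# Hypothetical combined-format line, used only for illustration.
SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

m = LOG_RE.match(SAMPLE)
fields = m.groups() if m else ()
# Group order: ip, ident, user, time, request, status, size, referrer, agent
print(fields)
```

<p>Note that the time group captures only the timestamp; the timezone offset is matched but not captured.</p>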
]]></content:encoded>
	</item>
	<item>
		<title>By: m0j0</title>
		<link>http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/comment-page-1/#comment-197</link>
		<dc:creator>m0j0</dc:creator>
		<pubDate>Sat, 15 Mar 2008 14:17:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/#comment-197</guid>
		<description>Doug -- Thanks - that&#039;s a great article, as it talks about these optimizations in the context of *exactly* what I&#039;m doing. Thanks again!

Kent -- I haven&#039;t posted the revised code. I was hoping that the apachelogs author would make his project public so I could just pass on improvements and have them accepted or not instead of maintaining a separate version distributed with loghetti. In the meantime, I was hoping to pull most of my custom code *out* of that module so that there aren&#039;t duplicate modules flying around. I&#039;ll just put it somewhere in the Filter class, which is what I started doing to implement the &#039;lazy&#039; features.

Thanks for the tips above - they mirror some thoughts I&#039;m having after reading the article linked by Doug. There&#039;s lots of great stuff in there. Gimme another week, and I&#039;ll post the revised code.</description>
		<content:encoded><![CDATA[<p>Doug &#8212; Thanks &#8211; that&#8217;s a great article, as it talks about these optimizations in the context of *exactly* what I&#8217;m doing. Thanks again!</p>
<p>Kent &#8212; I haven&#8217;t posted the revised code. I was hoping that the apachelogs author would make his project public so I could just pass on improvements and have them accepted or not instead of maintaining a separate version distributed with loghetti. In the meantime, I was hoping to pull most of my custom code *out* of that module so that there aren&#8217;t duplicate modules flying around. I&#8217;ll just put it somewhere in the Filter class, which is what I started doing to implement the &#8216;lazy&#8217; features.</p>
<p>Thanks for the tips above &#8211; they mirror some thoughts I&#8217;m having after reading the article linked by Doug. There&#8217;s lots of great stuff in there. Gimme another week, and I&#8217;ll post the revised code.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kent Johnson</title>
		<link>http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/comment-page-1/#comment-196</link>
		<dc:creator>Kent Johnson</dc:creator>
		<pubDate>Sat, 15 Mar 2008 13:28:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/#comment-196</guid>
		<description>Did you post the revised code?

In _ApacheLogFileGenerator.Generator:
- hoist the lookup of self.r.match out of the loop
- Create the ApacheLogLine with
  log_line = ApacheLogLine(*m.groups())
to cut out a bunch of method calls.

BTW you could simplify this code by making Generator be ApacheLogFile.__iter__; the nested class and delegation from __iter__ are not needed.

BTW for your grep test did you use the same regex or did you just grep 404?</description>
		<content:encoded><![CDATA[<p>Did you post the revised code?</p>
<p>In _ApacheLogFileGenerator.Generator:<br />
- hoist the lookup of self.r.match out of the loop<br />
- Create the ApacheLogLine with<br />
  log_line = ApacheLogLine(*m.groups())<br />
to cut out a bunch of method calls.</p>
<p>BTW you could simplify this code by making Generator be ApacheLogFile.__iter__; the nested class and delegation from __iter__ are not needed.</p>
<p>BTW for your grep test did you use the same regex or did you just grep 404?</p>
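<p>The suggestions above could be sketched roughly as follows. This is not the apachelogs module itself: the ApacheLogLine field names, the constructor taking an iterable of lines, and the sample data are all assumptions made for illustration.</p>

```python
import re
from collections import namedtuple

# Hypothetical stand-in for the module's ApacheLogLine; field names are assumed.
ApacheLogLine = namedtuple(
    "ApacheLogLine",
    "ip ident user time request status size referrer user_agent",
)


class ApacheLogFile:
    """Simplified stand-in that iterates over any iterable of log lines."""

    r = re.compile(
        r'(\d+\.\d+\.\d+\.\d+) ([^ ]*) ([^ ]*) \[([^ ]*) [^\]]*\] '
        r'"([^"]*)" (\d+) ([^ ]*) "([^"]*)" "([^"]*)"'
    )

    def __init__(self, lines):
        self.lines = lines

    def __iter__(self):
        # __iter__ is itself the generator, so no nested Generator class
        # or delegation is needed.
        match = self.r.match  # attribute lookup hoisted out of the loop
        for line in self.lines:
            m = match(line)
            if m:
                # One call builds the record from all captured groups.
                yield ApacheLogLine(*m.groups())


# Illustrative input: one well-formed line and one that does not match.
lines = [
    '10.0.0.1 - - [14/Mar/2008:10:00:00 -0400] '
    '"GET /missing HTTP/1.1" 404 209 "-" "curl/7.18"',
    "not a log line",
]
entries = list(ApacheLogFile(lines))
print([e.status for e in entries])
```

<p>Lines that do not match are skipped silently here; a real parser might prefer to count or report them.</p>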
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Hellmann</title>
		<link>http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/comment-page-1/#comment-195</link>
		<dc:creator>Doug Hellmann</dc:creator>
		<pubDate>Sat, 15 Mar 2008 11:57:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/#comment-195</guid>
		<description>This article should offer some other tips for boosting performance: http://effbot.org/zone/wide-finder.htm</description>
		<content:encoded><![CDATA[<p>This article should offer some other tips for boosting performance: <a href="http://effbot.org/zone/wide-finder.htm" rel="nofollow">http://effbot.org/zone/wide-finder.htm</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Toneby</title>
		<link>http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/comment-page-1/#comment-194</link>
		<dc:creator>Toneby</dc:creator>
		<pubDate>Sat, 15 Mar 2008 07:47:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/03/14/feedback-and-boredom-result-in-35-performance-boost-for-loghetti/#comment-194</guid>
		<description>&quot;grep &#124; wc -l&quot; isn&#039;t the optimal way; use &quot;grep -c&quot; instead and you should get even shorter times, since there are no extra processes and pipes involved. It&#039;s great that you are getting your log checker more optimized. I know it sucks waiting around to get the results back when investigating large logfiles; I fairly often have to do it with multi-gigabyte files.</description>
		<content:encoded><![CDATA[<p>&#8220;grep | wc -l&#8221; isn&#8217;t the optimal way; use &#8220;grep -c&#8221; instead and you should get even shorter times, since there are no extra processes and pipes involved. It&#8217;s great that you are getting your log checker more optimized. I know it sucks waiting around to get the results back when investigating large logfiles; I fairly often have to do it with multi-gigabyte files.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
