<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A Couple of MySQL Performance Tips</title>
	<atom:link href="http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/</link>
	<description>Made with only the finest 1's and 0's</description>
	<lastBuildDate>Thu, 26 Jan 2012 21:20:45 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: IT_Architect</title>
		<link>http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/comment-page-1/#comment-9149</link>
		<dc:creator>IT_Architect</dc:creator>
		<pubDate>Wed, 18 Feb 2009 18:48:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/#comment-9149</guid>
		<description>I didn&#039;t think I&#039;d ever see the day.  I&#039;ve been working as a DB consultant for a lot of years, and it has been anathema to even suggest denormalization.  The only thing I would add is, know thy database.  Some are very good at handling normalization, but if they are free or inexpensive, they are limited somehow.  You&#039;ll need to spend real money for real performance with lots of data.  Nobody WANTS to denormalize, but I&#039;ve been doing it for years when the DB engine required to do the job was not in the cards.  These are the killers for cheap or free databases:
1.  Joins are more than 1 deep.  E.G. Grandfather - Father - Son would be 2 deep.  Even Father - Son, Son slows some of them.
2.  Any time the ORDER BY is composed of columns of more than one table.
3.  Ad-hoc joins such as from QBE tools.  Helpful indexes are available, but they won&#039;t be used to resolve the query because the DB engine doesn&#039;t have the smarts to use them.
*BTW, you can also spend a ton, and get inexpensive database performance.</description>
		<content:encoded><![CDATA[<p>I didn&#8217;t think I&#8217;d ever see the day.  I&#8217;ve been working as a DB consultant for a lot of years, and it has been anathema to even suggest denormalization.  The only thing I would add is, know thy database.  Some are very good at handling normalization, but if they are free or inexpensive, they are limited somehow.  You&#8217;ll need to spend real money for real performance with lots of data.  Nobody WANTS to denormalize, but I&#8217;ve been doing it for years when the DB engine required to do the job was not in the cards.  These are the killers for cheap or free databases:<br />
1.  Joins are more than 1 deep.  E.G. Grandfather &#8211; Father &#8211; Son would be 2 deep.  Even Father &#8211; Son, Son slows some of them.<br />
2.  Any time the ORDER BY is composed of columns of more than one table.<br />
3.  Ad-hoc joins such as from QBE tools.  Helpful indexes are available, but they won&#8217;t be used to resolve the query because the DB engine doesn&#8217;t have the smarts to use them.<br />
*BTW, you can also spend a ton, and get inexpensive database performance.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom Passin</title>
		<link>http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/comment-page-1/#comment-1078</link>
		<dc:creator>Tom Passin</dc:creator>
		<pubDate>Tue, 13 May 2008 16:57:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/#comment-1078</guid>
		<description>I used to work a lot with SQL Anywhere, and had a lot of queries with multi-way joins.  Of course, they tended to be very slow - v-e-r-y s-l-o-w in some cases.  I found that often the existing indexes weren&#039;t being used.  It turned out that I could get the optimizer to use them by specifying apparently redundant conditions in a where clause.

IOW, if you can discover how to get the optimizer to help you (which may take some trickery), you can turn an O(n^2) or worse query into something quite reasonable.  I don&#039;t know about MySQL - I haven&#039;t needed to make similar queries since I&#039;ve been using it.</description>
		<content:encoded><![CDATA[<p>I used to work a lot with SQL Anywhere, and had a lot of queries with multi-way joins.  Of course, they tended to be very slow &#8211; v-e-r-y s-l-o-w in some cases.  I found that often the existing indexes weren&#8217;t being used.  It turned out that I could get the optimizer to use them by specifying apparently redundant conditions in a where clause.</p>
<p>IOW, if you can discover how to get the optimizer to help you (which may take some trickery), you can turn an O(n<sup>2</sup>) or worse query into something quite reasonable.  I don&#8217;t know about MySQL &#8211; I haven&#8217;t needed to make similar queries since I&#8217;ve been using it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ira Pfeifer</title>
		<link>http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/comment-page-1/#comment-1077</link>
		<dc:creator>Ira Pfeifer</dc:creator>
		<pubDate>Tue, 13 May 2008 16:42:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/#comment-1077</guid>
		<description>Using clustered indices raises a few other issues as well.  For starters, you can only have 1 clustered index per table.  This may seem self-evident, but you&#039;d be surprised how many developers don&#039;t realize it.

Also, besides the memory issue, another potential performance problem that can be caused by clustered indices involves page splits.  I&#039;ll try to explain as briefly as possible:

The typical clustered index on a table is on the Primary Key, which is usually an INT IDENTITY column.  This key is monotonically increasing, so any new rows inserted will come after all existing rows.  This means that each data page will be filled before creating a new one, so the minimum number of new pages is created and the minimum number of IOs is performed.

If you put a clustered index on something else, the physical data needs to be kept in that order.  So say you&#039;ve put a clustered index on UserId.

UserId  Data
1          a
1          b
1          c
2          a
2          d

If you insert (1,d), that row has to go in between (1,c) and (2,a).  If the data page is full (which is optimal for minimizing IOs and space utilization), then you have to split it, move half the data to the new page, and then insert the row.  

Now imagine how often this is going to happen if you&#039;re regularly inserting rows in the middle of your clustered index.  You&#039;re either going to have significantly fragmented indices, which will be slow, or you&#039;re going to get lots of page splits, which will slow down inserts.  There ARE situations in which a clustered index on this sort of data is warranted, such as when inserts are minimal, but often the best solution for an OLTP database with a balanced workload is a clustered index on the PK and non-clustered indices on the other columns you&#039;re interested in.

Of course, as with everything DB-related, you&#039;ll need to apply these concepts to your specific implementation, but they should be considered.</description>
		<content:encoded><![CDATA[<p>Using clustered indices raises a few other issues as well.  For starters, you can only have 1 clustered index per table.  This may seem self-evident, but you&#8217;d be surprised how many developers don&#8217;t realize it.  </p>
<p>Also, besides the memory issue, another potential performance problem that can be caused by clustered indices involves page splits.  I&#8217;ll try to explain as briefly as possible:</p>
<p>The typical clustered index on a table is on the Primary Key, which is usually an INT IDENTITY column.  This key is monotonically increasing, so any new rows inserted will come after all existing rows.  This means that each data page will be filled before creating a new one, so the minimum number of new pages is created and the minimum number of IOs is performed.</p>
<p>If you put a clustered index on something else, the physical data needs to be kept in that order.  So say you&#8217;ve put a clustered index on UserId.</p>
<p>UserId  Data<br />
1          a<br />
1          b<br />
1          c<br />
2          a<br />
2          d</p>
<p>If you insert (1,d), that row has to go in between (1,c) and (2,a).  If the data page is full (which is optimal for minimizing IOs and space utilization), then you have to split it, move half the data to the new page, and then insert the row.  </p>
<p>Now imagine how often this is going to happen if you&#8217;re regularly inserting rows in the middle of your clustered index.  You&#8217;re either going to have significantly fragmented indices, which will be slow, or you&#8217;re going to get lots of page splits, which will slow down inserts.  There ARE situations in which a clustered index on this sort of data is warranted, such as when inserts are minimal, but often the best solution for an OLTP database with a balanced workload is a clustered index on the PK and non-clustered indices on the other columns you&#8217;re interested in.</p>
<p>Of course, as with everything DB-related, you&#8217;ll need to apply these concepts to your specific implementation, but they should be considered.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: m0j0</title>
		<link>http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/comment-page-1/#comment-1073</link>
		<dc:creator>m0j0</dc:creator>
		<pubDate>Tue, 13 May 2008 14:15:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/#comment-1073</guid>
		<description>Heh. Having answered that question is a prerequisite. 

Also, people have differing opinions about when it&#039;s ok to use a database. I&#039;m assuming that the reader has an interest in tuning performance, and is probably dealing with large(r) amounts of data, in which case you probably aren&#039;t going to get better performance with the same flexibility out of some non-database-backed solution. I&#039;d be interested to hear examples of, say, HDF5 or XML or some other file-based mechanism outperforming a database and still being able to do complex queries. I tend to find more situations where people *should* use a database and don&#039;t than the reverse.</description>
		<content:encoded><![CDATA[<p>Heh. Having answered that question is a prerequisite. </p>
<p>Also, people have differing opinions about when it&#8217;s ok to use a database. I&#8217;m assuming that the reader has an interest in tuning performance, and is probably dealing with large(r) amounts of data, in which case you probably aren&#8217;t going to get better performance with the same flexibility out of some non-database-backed solution. I&#8217;d be interested to hear examples of, say, HDF5 or XML or some other file-based mechanism outperforming a database and still being able to do complex queries. I tend to find more situations where people *should* use a database and don&#8217;t than the reverse.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack Diederich</title>
		<link>http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/comment-page-1/#comment-1072</link>
		<dc:creator>Jack Diederich</dc:creator>
		<pubDate>Tue, 13 May 2008 12:59:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.protocolostomy.com/2008/05/12/a-couple-of-mysql-performance-tips/#comment-1072</guid>
		<description>You missed the first question &quot;should I be using a database for this?&quot;  Most applications shouldn&#039;t, but use one anyway.</description>
		<content:encoded><![CDATA[<p>You missed the first question &#8220;should I be using a database for this?&#8221;  Most applications shouldn&#8217;t, but use one anyway.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

