What if we use a subquery to fetch the URL for the highest numbered headline?
mysql> EXPLAIN SELECT Url FROM Headline WHERE Id IN (SELECT MAX(Id) FROM Headline);
After waiting five minutes, we killed the query. Either we did something wrong, or MySQL wasn’t using the obvious approach to resolve this query. Hmm.
To find out, let’s explain it.
mysql> EXPLAIN SELECT Url FROM Headline WHERE Id IN (SELECT MAX(id) FROM Headline) \G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: Headline
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 302116
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBSELECT
table: Headline
type: index
possible_keys: NULL
key: PRIMARY
key_len: 4
ref: NULL
rows: 302116
Extra: Using index
2 rows in set (0.00 sec)
Yikes!
MySQL isn’t using any indexes! Notice that both possible_keys fields are NULL. Is this a bug? Perhaps, especially when you consider that the key field in the dependent subselect says that it selected the primary key. But it wasn’t in the list of possible keys. And, worse yet, MySQL believes it must examine 302,116 rows to resolve a single-record lookup supposedly based on a primary key.
Of course, this testing was performed with MySQL 4.1.0 alpha, prerelease code in which the query optimizer hadn’t been properly tuned to handle subselects well.[1] The point isn’t that MySQL didn’t do the right thing. No matter how well tuned it is, MySQL will make a bad decision once in a while. When it does, you need to be able to diagnose the problem and, in some cases, come up with a workaround.
So let’s rewrite the query a bit to simplify things. We’re using IN(…) in a query that can only return one row. So let’s change that to an equality (=) test.
mysql> SELECT Url FROM Headline WHERE Id = (SELECT MAX(id) FROM Headline);
+———————————————-+
| Url |
+———————————————-+
| http://biz.yahoo.com/bw/030331/315850_1.html |
+———————————————-+
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT Url FROM Headline WHERE Id = (SELECT MAX(id) FROM Headline) \G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: Headline
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
*************************** 2. row ***************************
id: 2
select_type: SUBSELECT
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Select tables optimized away
2 rows in set (0.00 sec)
A-ha! That did it. The query ran in a split second.
