Counting The Costs Martin Widlake's Yet Another Oracle Blog
Counting The Costs Martin Widlake's Yet Another Oracle Blog
search
go!
Posted by mwidlake in performance. Tags: explain plan, performance, SQL trackback Anyone who has been around the Oracle DBA or Developer world for a while is sure to have come across people debating(*) the fastest way of selecting the number of rows from a table. Different suggestions have been that count(*) is not as fast as count(pk_column), count(rowid) or count(1). (*) by debating I mean arguing, sometimes getting angry and, on one memorable occasion, pretty much asking each other to step outside to resolve this matter Man-to-Man (why do women not feel the need for this final step? Just more sensible I guess). Bottom line, it makes no difference whether you select count(*), count(1), count(pk) on Oracle 10 (and, I am sure, Oracle 9). However, that is an unsupported claim and you should not believe me yet. Any web site or blog making a solid statement about how Oracle works needs to provide some proof, otherwise it is just opinion. What I find more interesting is that some people seem happy to spend more time debating the question than doing a few tests to find out for sure. Also, proving the answer to such a simple question can actually lead to some more interesting insights. Ive followed the lead of my betters and I have created a script to run through all the below tests. If I can bend my very new blogging skills to it, the sql*plus script can be downloaded from this link. Below follows my proof and some observations on the way. First I better show what version of Oracle I am using and the basic set up as far as sql performance goes (Ive trimmed the output to remove number of rows returned, timing etc where it is not significant) test102> basic setup info test102>select version,archiver,database_status from v$instance; VERSION ARCHIVE DATABASE_STATUS - 10.2.0.1.0 STARTED ACTIVE
Twittering
@UweHesse AAAAARRGGGHHHH!!!!!!!!! :-) 2 days ago @UweHesse +1 - but do you ever ask your
candidates what they want to be taught next? :-) 2 days ago @dw_pete @cdacko I'm just looking forward to steeling your desk, Pete :-) Have a good conference 2 days ago
Interactive is best
test102>select substr(name,1,30) name,substr(value,1,20) value 2 from v$parameter 3 where name in (db_block_size,'compatible,'cpu_count,'db_file_multiblock_read_count 4 ,optimizer_mode,'sga_target,'sort_area_size) 5 order by name 6 / NAME VALUE
I feel Blogs work better if they have comments and discussion threads, so please feel free to comment.
Email Subscription
Enter your email address to subscribe to this blog and get mails about new posts Join 77 other followers
compatible 10.2.0.1.0 cpu_count 1 db_block_size 8192 db_file_multiblock_read_count 16 optimizer_mode ALL_ROWS sga_target 612368384 sort_area_size 65536 Now I create a reasonably sized table with a numeric column ID to be the primary key, column NUM_1 which is a small varying numeric, a unique column ID_2, NUM_2 holding random values between 0 and 99, another small number column NUM_3 and a final numeric column NUM_4 which is sparse, only 1 in 20 records will hold a value. Finally there are three columns VC_1 to VC_3 which just hold strings to pad out the table a little. test102>drop table test_1; Table dropped. Elapsed: 00:00:00.59 test102> test102> create a table with 20k rows, an ascending numeric PK test102> a unique VC test102> a non unique numeric test102> a sparsly populated numeric test102> and some padding fields. test102>create table test_1 2 as select rownum id 3 ,mod(rownum,10) num_1
Sign me up!
Top Posts
Friday Philosophy - Oracle Performance Silver Bullet Why is my SYSTEM Tablespace so Big? Slow querying of DBA_EXTENTS, DBA_FREE_SPACE and dropping tables Rant - Unique means UNIQUE! Argh! What are Consistent Gets? Stuck execution plans - purging a cursor Shrinking Tables to Aid Full Scans Friday Philosophy - Identifying and Nullifying Fake Urgency DBA_TAB_MODIFICATIONS Index Organized Tables - the Basics.
Blogroll
Dion Cho Doug Burns Jason Arneil Jonathan Lewis Kellyn Pot'Vin's DBA Kevlar blog
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
1/7
4/25/12
4 ,A'||to_char(rownum+1000) id_2 6 ,mod(rownum,5) num_3
5 ,trunc(dbms_random.value(1,100)) num_2 7 8 9 10 11 12 13 14 ,decode(mod(rownum,10),0,trunc(dbms_random.value(1,30)) ,null) num_4 ,lpad(A,50,A') vc_1 ,lpad(B,50,B') vc_2 ,lpad(C,250,C') vc_3 from dba_objects where rownum < 20000 /
Pages
About Martin Films That I Like Appearances and Meetings Ora600.ltd Quick Tips
test102>desc test_1 Name Null? Type ID NOT NULL NUMBER NUM_1 NUMBER ID_2 VARCHAR2(41) NUM_2 NUMBER NUM_3 NUMBER NUM_4 NUMBER VC_1 VARCHAR2(50) VC_2 VARCHAR2(50) VC_3 VARCHAR2(250) Now I add a primary key (a supporting unique index is created automatically), a unique index, a non-unique index and an index on the sparse column. Finally, I use the old ANALYZE command so compute statistics on the table and index, so that the Oracle Optimizer has information about the table and its indexes. test102> now add a pK on the ID test102>alter table test_1 2 add constraint t_pk primary key (id); Table altered. test102> add a unique index on the unique varchar test102>create unique index t_uq on test_1(id_2); Index created.
My Web Site
If you want to know more about me, my company Ora600 Ltd, or see what scripts and presentations I have managed to put on my web site then please visit it here
Meta
Register Log in Entries RSS Comments RSS WordPress.com
FEEDS
FULL
COMMENTS
test102> add a non-unique index on the non-unique numeric test102>create index t_nonuq on test_1(num_2); Index created. test102> and finally an index on the sparse column test102>create index t_sparse on test_1(num_4); Index created. test102> now add statistics
test102>analyze table test_1 compute statistics; Table analyzed. That is the setup, now I turn on autotrace (which causes SQL*Plus to show information about the statements executed), timing and Ill do my first test, the traditional select count(*). test102>set autotrace on test102>set timi on test102>set pause off test102> Traditional count(*)
19999 1 row selected. Elapsed: 00:00:00.01 Execution Plan Plan hash value: 454320086 | Id | Operation | Name | Rows | Cost (%CPU)| Time | 0 | SELECT STATEMENT | 1 | SORT AGGREGATE | | | | 1| 1| 11 |
(0)| 00:00:01 | | |
(0)| 00:00:01 |
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
2/7
4/25/12
53 consistent gets 0 physical reads 0 db block gets
381 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk)
1 rows processed What does this show? Well, there are 19,999 rows, it took 1 millisecond (ms) to do the count. The plan is simple, there was a fast full scan of the index T_PK, cost was 11 and it took 1ms The statistics show 44 recursive calls, 0 block reads (this is physical reads from disc, as far as Oracle is concerned) and 53 consistent gets, which is reads from memory. Ill ignore the rest for now. the results are stable. Ill do it again. test102>select count(*) from test_1; 19999 1 row selected. Elapsed: 00:00:00.00 Execution Plan Plan hash value: 45432008 COUNT(*) However, I have done one test on one statement. I was taught to be a scientist at college, one isolated test is not enough, I do not know if
| Id | Operation | Name | Rows | Cost (%CPU)| Time | 0 | SELECT STATEMENT | 1 | SORT AGGREGATE | | | | 1| 1| 11
| 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 Statistics 0 recursive calls 0 db block gets 46 consistent gets 0 physical reads 0 redo size
(0)| 00:00:01 | | |
(0)| 00:00:01 |
413 bytes sent via SQL*Net to client 381 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
Well the result and the plan are the same, but now the code takes less than 1ms, there are no recursive calls and consistent gets drop from 53 to 46. The recursive calls and the extra consistent gets in the first statement are due to Oracle doing some parsing work on the
statement it has checked the syntax, looked up the objects in the data dictionary and come up with an execution plan (I have to be honest, this is not quite true, I have run the statement several times already so some of that work was done before, thus the parse work in this example is a little light).
The lesson here is, treat the first execution of a statement as a special case. The second and further executions give better stats on the execution of the statement. Run the select count(*) from test_1 half a dozen times yourself and you should see the stats are stable. Now, select count(id) from test_1. Column ID is a primary key test102> now count on the Primary key test102>select count(id) from test_1; COUNT(ID) 19999 1 row selected. Elapsed: 00:00:00.00 Execution Plan Plan hash value: 454320086 | Id | Operation | Name | Rows | Cost (%CPU)| Time |
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
3/7
4/25/12
| 0 | SELECT STATEMENT | | 1 | 11 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1| | | 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 Statistics 1 recursive calls 0 db block gets 46 consistent gets
| (0)| 00:00:01 |
0 physical reads 0 redo size 414 bytes sent via SQL*Net to client 381 bytes received via SQL*Net from client 2 0 0 1 SQL*Net roundtrips to/from client sorts (memory) sorts (disk) rows processed
That is spookily like the second run of count(*). Remembering my comments on the first run, Lets do a second run of the code. test102>select count(id) from test_1; COUNT(ID) 19999 1 row selected. Elapsed: 00:00:00.00 Execution Plan Plan hash value: 454320086 | Id | Operation | Name | Rows | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 11 (0)| 00:00:01 | 1 | SORT AGGREGATE | | 1| | | | 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 (0)| 00:00:01 | Statistics 0 0 46 0 recursive calls db block gets consistent gets physical reads
0 redo size 414 bytes sent via SQL*Net to client 381 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
Yep, exact same plan and exact same stats as for count(*). Why is there only a difference of 1 recursive call between the two runs? The Oracle kernel has spotted that the query can be satisfied with the same plan, so it just allocates it to the cursor (ie uses it for this statement). Note, the plan hash value is the same. check I am not deleting anything significant. test102> count on a static value COUNT(1) test102>select count(1) from test_1; 19999 Now to try count(1). For brevity I am going to trim the output of things that do not change. You have the script to run yourself so you can
| Id | Operation | Name | Rows | Cost (%CPU)| Time | | 0 | SELECT STATEMENT | | 1 | 11 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1| | | | 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 -
(0)| 00:00:01 |
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
4/7
4/25/12
Statistics 1 recursive calls 0 db block gets 46 consistent gets
Exactly the same as the first iteration of select count(id). The plan hash value is the same too. Second run? test102>select count(1) from test_1; 19999 COUNT(1)
| 0 | SELECT STATEMENT | | 1 | 11 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1| | | | 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 (0)| 00:00:01 | Statistics 0 recursive calls 0 db block gets 46 consistent gets
I feel that is pretty good supporting evidence that count(*), count(id) and count(1) are all being executed the same and are equally efficient. As a final test for now, how about select count(any old text)? test102>count on a larger, text static value test102>select count(any old text) from test_1; COUNT(ANYOLDTEXT) 19999 Execution Plan Plan hash value: 454320086 -
| Id | Operation | Name | Rows | Cost (%CPU)| Time | | 0 | SELECT STATEMENT | | 1 | 11 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1| | | | 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 Statistics 1 recursive calls 0 db block gets 46 consistent gets test102>select count(any old text) from test_1; COUNT(ANYOLDTEXT) 19999 Execution Plan Plan hash value: 454320086 | Id | Operation | Name | Rows | Cost (%CPU)| Time |
(0)| 00:00:01 |
| 0 | SELECT STATEMENT | | 1 | 11 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1| | | | 2 | INDEX FAST FULL SCAN| T_PK | 19999 | 11 (0)| 00:00:01 | Statistics 0 recursive calls 0 db block gets 46 consistent gets
Tomorrow I will continue on the theme but that is enough for now.
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
5/7
4/25/12
Comments
1. PdV - May 18, 2009
Hello Martin! And a nice, simple, understandable and repeatable blogpost. At risk of debating the topic to death: My guess is that any Unique Index could be used to count the records. It would be interesting to see if a smaller UK index (it may need an explicit constraint) would be chosen over a larger (overloaded, inflated) PK index, assuming a table had those kind of indexes. As for displaying SQL and Code-text in blogs, in the blogspot area the
t a gw o r k so k ,b u ti ts t i l lr e m o v e sw h i t e s p a c e . A n yi d e ao fb e t t e rw a y st od i s p l a yc o d ei n" c o u r i e r "f o n t? H o wd ot h ep r o sd ot h a t ?
A n y w a y s ,K e e pW r i t i n g !
P d V
Reply
3. Counting the Cost Martin Widlakes Yet Another Oracle Blog - May 22, 2009
[...] this first posting for the details of setting up the table and showing that select count(*), count(1) and count(pk) [...] Reply
4. Counting the Cost #4 The speedy way Martin Widlakes Yet Another Oracle Blog - May 26, 2009
[...] far I have shown with this posting here that the myths and arguments about whether count(*), count(1), count(pk) etc are immaterial [...] Reply
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
6/7
4/25/12
Leave a Reply
Enter your comment here...
Top
mwidlake.wordpress.com/2009/05/17/counting-the-costs/
7/7