Bobby Durrett's DBA Blog

Oracle database performance

Outline Hint from dbms_xplan does not contain PARALLEL hint

Fri, 2021-09-03 12:36

I was tuning a parallel query and was a little surprised to find that an outline hint from EXPLAIN PLAN and dbms_xplan.display did not contain a PARALLEL hint. As a result, a query I was testing did not run in parallel with only the outline hint; I had to add the PARALLEL hint as well. This surprised me because I think of an outline hint as a full set of hints.

I built a testcase to demonstrate this. Here is a zip of its script and output on a 19c database: blogparalleloutlinehints.zip.

I created a small test table and got the outline for the plan of a select statement with a parallel hint:

drop table test;

create table test noparallel as select * from dba_tables;

explain plan into plan_table for 
select /*+ parallel(test,8) */ sum(blocks) from test;

set markup html preformat on

select * from table(dbms_xplan.display('PLAN_TABLE',NULL,'ADVANCED'));

The outline hint looks like this:

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "TEST"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Notice that it just has a FULL hint and nothing about running the query in parallel.

Next, I ran the query with the outline hint but without any parallel hint like this:

select   /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "TEST"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */ sum(blocks) from test;

It ran with a serial (non-parallel) plan:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    27 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| TEST |  1785 |  5355 |    27   (0)| 00:00:01 |
---------------------------------------------------------------------------

Then I ran the query with both a parallel hint and the same outline hint:

select   /*+ parallel(test,8)
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "TEST"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */ sum(blocks) from test;

It ran the desired parallel plan:

-------------------------------------------...---------------------
| Id  | Operation              | Name     |...|IN-OUT| PQ Distrib |
-------------------------------------------...---------------------
|   0 | SELECT STATEMENT       |          |...|      |            |
|   1 |  SORT AGGREGATE        |          |...|      |            |
|   2 |   PX COORDINATOR       |          |...|      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |...| P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |...| PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |...| PCWC |            |
|*  6 |       TABLE ACCESS FULL| TEST     |...| PCWP |            |
-------------------------------------------...---------------------

In other tuning situations with plans that did not execute in parallel, the outline hints from dbms_xplan.display_cursor did a great job of capturing all the details of the plan. But when extracting an outline hint from a query that has a parallel hint in it, I needed both the outline hint and the parallel hint to get the same plan.
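For reference, this is roughly how I pull an outline from a statement that has already run, using dbms_xplan.display_cursor with the OUTLINE format option. This is just a sketch; the sql_id and child number are placeholders:

select * from table(dbms_xplan.display_cursor('&sql_id', 0, 'OUTLINE'));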

Bobby

Categories: DBA Blogs

Tuning Example With Outline Hints and SQL Profile

Tue, 2021-08-24 17:22

I tuned a production query this week using outline hints and a SQL profile. This is like other examples I have posted but may have enough differences to be worth documenting here.

A query in a batch job was taking over an hour to run. I got it to run in under 30 minutes.

Old plan

898524 rows selected.

Elapsed: 01:13:34.35

New plan

899018 rows selected.

Elapsed: 00:28:00.28

The new plan ran about 45 minutes faster.

The main point here is that I was able to improve the plan in a more selective way using an outline hint than I could have with regular hints.

I ran the problem query using my test2.sql script to show the expected and actual number of rows in each step of the plan and to show the amount of time spent on each step. The output looked like this:

This is the typical example of a bad plan using an index in a case where a full scan is more efficient. Most of the time is spent in step 4 doing 898 thousand table lookups. So, to fix this query I tried a FULL hint to get rid of the SKIP SCAN, but it did not help. I tried various combinations of FULL hints on the four tables in the plan and they all resulted in inefficient plans with different types of joins than the original plan.

Full hints:

SELECT /*+ FULL(DD) FULL(ORG) FULL(WINDOW1) FULL(WINDOW2) */
       DD.ROW_ID
...

Plan:

-----------------------------------------------
| Id  | Operation              | Name         |
-----------------------------------------------
|   0 | SELECT STATEMENT       |              |
|*  1 |  HASH JOIN RIGHT OUTER |              |
|*  2 |   TABLE ACCESS FULL    | S_ORG_EXT_XM |
|*  3 |   HASH JOIN RIGHT OUTER|              |
|*  4 |    TABLE ACCESS FULL   | S_ORG_EXT_XM |
|*  5 |    HASH JOIN           |              |
|*  6 |     TABLE ACCESS FULL  | S_ORG_EXT    |
|*  7 |     TABLE ACCESS FULL  | S_ORG_EXT_XM |
-----------------------------------------------

Then I got the idea of using an outline hint. The plan.sql script that I use to run EXPLAIN PLAN on a query outputs an outline hint. In this case the outline hint looks like this:

The main idea that I wanted to write this post about is to take this outline hint, leave everything else the same, and replace only the skip scan with a full table scan. Adding a FULL hint was changing a lot of things about the plan. But if I start with the outline hint, I can force the plan to remain the same except for the one thing I change. I ran plan.sql against the query with a FULL hint to get the outline with a full scan and then pulled out just the full scan line to put into the original plan’s outline. Here are the old and new lines that I used to edit the outline hint for the original query’s plan: an index skip scan and a full table scan.

Old:

INDEX_SS(@"SEL$1" "DD"@"SEL$1" ("S_ORG_EXT_XM"."PAR_ROW_ID" "S_ORG_EXT_XM"."TYPE"
        "S_ORG_EXT_XM"."NAME" "S_ORG_EXT_XM"."CONFLICT_ID"))

New:

FULL(@"SEL$1" "DD"@"SEL$1")

Here is the edited version of the outline hint:

So, then I ran my test2.sql script with the edited outline hint and the query ran in 28 minutes. At that point all we had to do was put the outline hint into the existing SQL script.

But then I realized that I could finish off the performance improvement using a SQL Profile and not have to edit the script. The output for the run of my test2.sql script with the hint had this sql_id and plan_hash_value: 3m80wvb2v6vdq 42036438.

My slow production query had this sql id and plan hash value: dzfr3vz7z5p66 2238264355.

So, I passed these two sets of parameters to coe_xfr_sql_profile.sql in the sqlt/utl subdirectory creating these two scripts:

coe_xfr_sql_profile_3m80wvb2v6vdq_42036438.sql

and

coe_xfr_sql_profile_dzfr3vz7z5p66_2238264355.sql
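Run from SQL*Plus in the sqlt/utl directory, those two calls would have looked something like this (a sketch; prompts and output vary by SQLT version):

@coe_xfr_sql_profile.sql 3m80wvb2v6vdq 42036438
@coe_xfr_sql_profile.sql dzfr3vz7z5p66 2238264355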

So, I hacked the two scripts together to make coe_xfr_sql_profile_dzfr3vz7z5p66_42036438.sql which has the SQL text of the original query and the plan that my test script with the outline hint used. The scripts output by coe_xfr_sql_profile.sql save the text of the SQL statement and its plan when creating a new SQL Profile. You can get a little creative and connect production SQL text with a plan you have generated through hints.

  9  DBMS_LOB.CREATETEMPORARY(sql_txt, TRUE);
 10  DBMS_LOB.OPEN(sql_txt, DBMS_LOB.LOB_READWRITE);
 11  -- SQL Text pieces below do not have to be of same length.
 12  -- So if you edit SQL Text (i.e. removing temporary Hints),
 13  -- there is no need to edit or re-align unmodified pieces.
 14  wa(q'[SELECT DD.ROW_ID|| '||' ||
 15  DD.PAR_ROW_ID|| '||' ||

text of original query starts line 14

 55  AND WINDOW2.X_DELIV]');
 56  wa(q'[ERY_WINDOW  (+) = '2'
 57  AND WINDOW2.TYPE (+)= 'DELIVERY TIMES']');
 58  DBMS_LOB.CLOSE(sql_txt);
 59  h := SYS.SQLPROF_ATTR(
 60  q'[BEGIN_OUTLINE_DATA]',
 61  q'[IGNORE_OPTIM_EMBEDDED_HINTS]',

text of query ends line 57. outline hint with full scan starts line 60

 80  q'[USE_MERGE(@"SEL$1" "ORG"@"SEL$1")]',
 81  q'[END_OUTLINE_DATA]');
 82  :signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
 83  :signaturef := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt, TRUE);

ends line 81

So, I ran the edited coe_xfr_sql_profile_dzfr3vz7z5p66_42036438.sql to put in the SQL Profile, which related the production query to the plan generated by the edited outline hint, and the production job ran in 28 minutes last night without any change to its SQL code. The outline hint allowed me to leave the original plan intact except for the one step that needed a full scan instead of an index skip scan. Then the SQL Profile trick allowed me to apply the better plan without any change to the application.

Bobby

P.S. It looks like I didn’t have to hack together the sql profile script the way I did. See this post: https://www.bobbydurrettdba.com/2014/03/19/using-hints-with-coe_xfr_sql_profile-sql/

I could have just extracted the SQL Profile script like this:

coe_xfr_sql_profile.sql dzfr3vz7z5p66 42036438

That would have generated coe_xfr_sql_profile_dzfr3vz7z5p66_42036438.sql automatically, combining the production SQL text with the plan that used the outline hint and full scan, instead of me editing it together manually.

Categories: DBA Blogs

Two New ASH FORCE_MATCHING_SIGNATURE scripts

Thu, 2021-07-15 17:40

I just put two new scripts on my OracleDatabaseTuningSQL GitHub repository. These came out of a PeopleSoft Financials performance problem I was working on that involved a lot of similar SQL statements that used constants instead of bind variables, so they did not show up at the top of the AWR report. I had to look at ASH data to find them and had to group by their force matching signature to group the similar statements together. These are the two scripts:

ashfmscount.sql – Looks at a single application engine session and groups all the time spent by force matching signature to find the queries that consumed the most time. I used my simple ashdump.sql script to dump out a few rows when I knew the app engine was running and I found the SESSION_ID and SESSION_SERIAL# values there.

ashtopelapsed.sql – This is meant to look like the SQL by elapsed time report on an AWR report except that it groups SQL by force matching signature but gives an example sql id with its text to give you an idea of what the signature represents. Might be good to run this the next time my AWR report does not have any long running SQL statement on the top SQL report.
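The core idea behind both scripts is to group ASH samples by FORCE_MATCHING_SIGNATURE instead of SQL_ID. Here is a minimal sketch of that grouping (not the actual scripts; it assumes each DBA_HIST_ACTIVE_SESS_HISTORY sample represents about 10 seconds of time):

select ash.force_matching_signature,
       count(*)*10 approx_total_seconds,
       min(ash.sql_id) example_sql_id
from dba_hist_active_sess_history ash
where ash.session_id = :session_id
  and ash.session_serial# = :session_serial
group by ash.force_matching_signature
order by approx_total_seconds desc;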

I really used the first one to resolve the issue along with various other scripts to get to that point. I created the second one just now as a possible future script to use in addition to an AWR report. I didn’t check the ASH report to see if this is a duplicate of it, but these two new scripts work well.

Bobby

Categories: DBA Blogs

Plan Change Due to Index Partition Names Mismatch

Wed, 2021-07-14 14:51

I want to document a bug I ran across in Oracle 11.2.0.3 on HP-UX Itanium and how it caused unexpected plan changes that would bring the database to its knees with many long running queries running the bad plan. I found that the problem table had a local index whose partition names did not match the partition names of the table and that recreating the index with partition names that matched the table resolved the issue.

We have a busy and important Oracle database that had queries that ran fine for weeks and months and then suddenly changed to a bad plan. They all had a similar pattern. The good plan did a simple range scan on a certain partition’s index. The bad plan did a fast full index scan.

Good:

PARTITION RANGE SINGLE | 
 INDEX RANGE SCAN | PROBLEM_INDEX

Bad:

PARTITION RANGE SINGLE | 
 INDEX FAST FULL SCAN | PROBLEM_INDEX

The query accessed the table using the first two columns of PROBLEM_INDEX, and the first column of the index is the partitioning column for the range partitioned table. So, if we call the first two columns of the index PARTITIONING_C and OTHER_C, the query was like:

select
...
from problem_table
where
PARTITIONING_C = :bindvar1 and
OTHER_C = :bindvar2
...

I have known for a long time that some choice of bind variable values was causing the plan to flip to the fast full scan on the index partition. But I did not know that it had to do with the names of the index’s partitions. The table and index creation scripts must have looked something like this:

create table problem_table
(
PARTITIONING_C NUMBER,
OTHER_C NUMBER,
...
)
PARTITION BY RANGE (PARTITIONING_C)
(
PARTITION P1000 VALUES LESS THAN (1000),
PARTITION P2000 VALUES LESS THAN (2000),
PARTITION P3000 VALUES LESS THAN (3000),
PARTITION P4000 VALUES LESS THAN (4000),
PARTITION P5000 VALUES LESS THAN (5000));

CREATE UNIQUE INDEX problem_index ON problem_table
(PARTITIONING_C, OTHER_C , ...)
LOCAL (  
  PARTITION P2000,
  PARTITION P3000,
  PARTITION P4000,
  PARTITION P5000,
  PARTITION P6000);

The index has the same number of partitions as the table and a lot of them have the same names, but they do not line up. I found this bug in the version of Oracle that we are on:

Bug 14013094 – DBMS_STATS places statistics in the wrong index partition

I am not sure that I am hitting that bug, but I am hitting some similar bug because 14013094 relates to index partitions names that do not match table partition names. For one partition of the problem index the statistics were set to 0 but its corresponding table partition had millions of rows. It would be as if partition P3000 on problem_table had 20,000,000 rows in stats and corresponding partition P4000 on problem_index had 0 rows. If I gathered index partition statistics on P4000 it correctly set stats to 20,000,000. If I gathered table partition statistics on P3000 it cleared the index partition stats on P4000 setting them to 0! (!!) Yikes! How weird is that? Seems obvious to me it is a bug, but maybe not exactly 14013094. I tried dropping and recreating the index leaving the partition names as they are, but it did not resolve the issue. Then I just created the index letting it default to matching partition names like this:

CREATE UNIQUE INDEX problem_index ON problem_table
(PARTITIONING_C, OTHER_C , ...)
LOCAL;
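To spot this kind of mismatch, a query along these lines matches the table and index partitions up by position (a sketch; owner filters are omitted and the object names are the made-up ones from this example):

select tp.partition_position,
       tp.partition_name table_partition,
       ip.partition_name index_partition
from dba_tab_partitions tp
join dba_ind_partitions ip
  on ip.partition_position = tp.partition_position
where tp.table_name = 'PROBLEM_TABLE'
  and ip.index_name = 'PROBLEM_INDEX'
order by tp.partition_position;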

I’m not sure how the partition names got mismatched but it is a simple fix. It took me a while staring at the partition statistics to realize what was going on and then it took a while to prove out the fix. We do not yet have this in production, but I believe we have nailed down the cause of the plan changes. In the past I have been using SQL Profiles to lock in the plans of any new query that uses the problem table, but I plan to run without them after putting in the index. I kick myself for not researching this earlier, but it was not obvious so maybe it was not crazy to use SQL Profiles for a while. But it became a burden to keep using them and it left the system vulnerable to a plan change on any query on the problem table that did not already have a SQL Profile.

In a lot of ways this is a simple query tuning problem with bad zero stats on a partition with millions of rows. Any time the statistics are wildly inaccurate bad plans are likely. But tracking down the reason for the zero statistics was not simple. I should mention that some of the table partitions are empty and have 0 rows. In fact, to take our example, imagine that table partition P4000 has 0 rows just like index partition P4000. No one would suspect that the statistics are wrong unless you realize that P4000 on the index corresponds not to P4000 but to P3000 on the table and P3000 has 20,000,000 rows!

Bind variables lead to these unexpected plan changes when the plan is different when different values are passed into the variables. It is a fundamental tradeoff of using bind variables to reduce the parsing that constants would cause. If one set of bind variable values causes a bad plan that plan gets locked into memory and it is used for all the other values passed to the query until the plan is flushed out of memory. There are some cases where the optimizer will look at the values passed into the bind variables and choose between multiple plans but that did not occur in this problem.

So, what can someone get from this post if they are not on 11.2.0.3 and do not have locally partitioned indexes whose partition names are similar to the table's but offset and mismatched? I think the point for me is to dig deeper into the underlying reason for a bad plan after using a SQL Profile to resolve it. I expect bad plans because I think that the optimizer is limited in its ability to correctly run a query even if it has the best possible statistics. Also, since I support many databases, I do not have time to dig deeply into the underlying reason for every bad plan. But I need to dig deeper when similar queries keep getting the same kind of bad plans. In many cases bad statistics lead to bad plans, and it is a lot better to fix the statistics once than to keep using SQL Profiles and hints repeatedly to fix similar queries with the same sort of switches to bad plans. In this case I had the bizarre fix of recreating the index with partition names that match the table, which ensured that the statistics on the index partitions were accurate and that I no longer need SQL Profiles to lock in the good plan.

Bobby

Categories: DBA Blogs

MongoDB test server on Oracle Linux

Fri, 2021-07-02 11:41

I use Oracle Enterprise Linux on VirtualBox running on my Windows 10 laptop for test servers (virtual machines) of various types of software. I just set up a MongoDB VM yesterday and thought I would document some of the things I did that are not in the standard documentation.

I followed this URL for the install:

https://docs.mongodb.com/manual/tutorial/install-mongodb-on-red-hat/

I created the yum repository file mongodb-org-4.4.repo as documented:

[root@mongodb yum.repos.d]# cat mongodb-org-4.4.repo
[mongodb-org-4.4]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.4/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.4.asc

I installed the MongoDB yum package:

[root@mongodb yum.repos.d]# sudo yum install -y mongodb-org
Loaded plugins: langpacks, ulninfo
mongodb-org-4.4                                                                                                           | 2.5 kB  00:00:00
ol7_UEKR4                                                                                                                 | 2.5 kB  00:00:00
ol7_latest                                                                                                                | 2.7 kB  00:00:00
mongodb-org-4.4/7Server/primary_db                                                                                        |  47 kB  00:00:02
Resolving Dependencies
--> Running transaction check
---> Package mongodb-org.x86_64 0:4.4.6-1.el7 will be installed
--> Processing Dependency: mongodb-org-shell = 4.4.6 for package: mongodb-org-4.4.6-1.el7.x86_64
...

I didn’t need sudo since I was root, but it worked. I don’t know if this was really needed but I set ulimit with these commands:

ulimit -f unlimited
ulimit -t unlimited
ulimit -v unlimited
ulimit -l unlimited
ulimit -n 64000
ulimit -m unlimited
ulimit -u 64000

I am not sure if these commands stick when you run them as root. They seem to but for now I’ve been running them manually after I reboot. These were documented here:

https://docs.mongodb.com/manual/reference/ulimit/

Based on this document I also created the file /etc/security/limits.d/99-mongodb-nproc.conf:

[root@mongodb ~]# cat /etc/security/limits.d/99-mongodb-nproc.conf
*          soft    nproc     64000
*          hard    nproc     64000
root       soft    nproc     unlimited
[root@mongodb ~]#

I don’t know for sure if this was needed, but it did not cause any problems.

I edited /etc/selinux/config to prevent SELinux from interfering:

[root@mongodb selinux]# diff config.07012021 config
7c7
< SELINUX=enforcing
---
> SELINUX=disabled

I also disabled the firewall just in case it would cause problems:

[root@mongodb ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
[root@mongodb ~]# systemctl stop firewalld
[root@mongodb ~]# systemctl status firewalld
? firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
...

Lastly, in order to be able to access MongoDB from outside the VM I had to edit /etc/mongod.conf to allow access from all IP addresses:

[root@mongodb etc]# diff  mongod.conf mongod.conf.07012021
29c29
<   bindIp: 0.0.0.0 # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.
---
>   bindIp: 127.0.0.1  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.

Of course, in a production system you would want to make this more secure, but this is just a quick and dirty test VM.

Finally, this command brings up the database:

systemctl start mongod

I ran this in a new root Putty window to get the ulimit settings. Not sure if that was necessary, but it did work.

I have a NAT network and port forwarding set up in VirtualBox so that, while MongoDB listens by default on port 27017 on localhost in the VM, I can connect to it through port 61029 on host 127.0.0.1 on my laptop.

Since the programming language that I am most familiar with is Python (I have not learned any JavaScript), I set up a test connection to my new MongoDB database using the pymongo module.

I installed it like this:

pip install pymongo[srv]

Simple test program looks like this:

from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

client = MongoClient('127.0.0.1', 61029)
try:
    # The ismaster command is cheap and does not require auth.
    client.admin.command('ismaster')
except ConnectionFailure:
    print("Server not available")

I got that from stackoverflow. I was also following the pymongo tutorial:

https://pymongo.readthedocs.io/en/stable/tutorial.html

One test program:

from pymongo import MongoClient

client = MongoClient('127.0.0.1', 61029)

db = client['test-database']

collection = db['test-collection']

import datetime
post = {"author": "Mike",
          "text": "My first blog post!",
          "tags": ["mongodb", "python", "pymongo"],
          "date": datetime.datetime.utcnow()}
          
posts = db.posts
post_id = posts.insert_one(post).inserted_id
print(type(post_id))
print(post_id)

print(db.list_collection_names())

import pprint
pprint.pprint(posts.find_one())

pprint.pprint(posts.find_one({"author": "Mike"}))

pprint.pprint(posts.find_one({"author": "Eliot"}))

pprint.pprint(posts.find_one({"_id": post_id}))

post_id_as_str = str(post_id)
pprint.pprint(posts.find_one({"_id": post_id_as_str}))

from bson.objectid import ObjectId

pprint.pprint(posts.find_one({"_id": ObjectId(post_id_as_str)}))

Its output:

<class 'bson.objectid.ObjectId'>
60df400f53f43d1c2703265c
['test-collection', 'posts']
{'_id': ObjectId('60de4f187d04f268e5b54786'),
 'author': 'Mike',
 'date': datetime.datetime(2021, 7, 1, 23, 26, 16, 538000),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}
{'_id': ObjectId('60de4f187d04f268e5b54786'),
 'author': 'Mike',
 'date': datetime.datetime(2021, 7, 1, 23, 26, 16, 538000),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}
None
{'_id': ObjectId('60df400f53f43d1c2703265c'),
 'author': 'Mike',
 'date': datetime.datetime(2021, 7, 2, 16, 34, 23, 410000),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}
None
{'_id': ObjectId('60df400f53f43d1c2703265c'),
 'author': 'Mike',
 'date': datetime.datetime(2021, 7, 2, 16, 34, 23, 410000),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}

I ran this on Python 3.9.6 so the strings like ‘Mike’ are not u’Mike’. It looks like the output on the tutorial is from some version of Python 2, so you get Unicode strings like u’Mike’ but on Python 3 strings are Unicode by default so you get ‘Mike’.

Anyway, I didn’t get any further than getting MongoDB installed and starting to run through the tutorial, but it is up and running. Might be helpful to someone else (or myself) running through this to set up a test VM.

Bobby

Categories: DBA Blogs

Real World Index Example

Tue, 2021-05-25 11:12

This is a follow up to my previous post with a real world example of how much difference indexes can make on tables that do not have them.

I looked at an AWR report of a peak hour on a busy database and noticed sql_id 028pmgvcryk5n:

Peak Hour No Indexes

The top SQL is a PL/SQL call so it does not have a plan but 028pmgvcryk5n stood out because it was taking 11.24% of the total elapsed time and running over 2 seconds (2.23) per execution. Most queries on this system run in hundredths of a second like .02 seconds.

I looked at this query’s plan and it included two tables with full scans on both:

I put indexes on each table and the new plan looks like this:

With indexes it ran the query in about 1 millisecond:

Without indexes it ran from 600 to 2000 milliseconds:

I guess I have a simple point. Indexes can make a big difference. A query that runs in 600 milliseconds may be good enough in some cases, but if it can run in 1 millisecond it is worth putting the right indexes in place.

Bobby

p.s. I used my sqlstat.sql and vsqlarea.sql scripts to get the execution history with and without the indexes. I used getplans.sql to get the plan without indexes from the AWR and extractplansqlid.sql to get the plan with the indexes from the V$ views. The query ran too fast to show up on the AWR once the indexes were in place so that is why I used the V$ queries to get information about the query after adding the indexes.

Categories: DBA Blogs

Simple Index Example

Thu, 2021-05-20 17:57

There is a lot to be said about when to add indexes on Oracle tables, but I wanted to show a simple example. In this case we have a table with no indexes and a query with two equal conditions in the where clause which identify a single row out of many. Here is a zip of the SQL*Plus script and log for this post: zip

The table TEST is a clone of DBA_TABLES. I load it up with a bunch of copies of DBA_TABLES and with a single row that is unique. Then I run the query without any indexes on the table:

SQL> select TABLESPACE_NAME
  2  from test
  3  where
  4  owner='TESTOWNER' and
  5  table_name = 'TESTTABLE';

TABLESPACE_NAME
------------------------------
TESTTABLESPACE

Elapsed: 00:00:00.15

I add an index on owner and table_name and run it again:

SQL> select TABLESPACE_NAME
  2  from test
  3  where
  4  owner='TESTOWNER' and
  5  table_name = 'TESTTABLE';

TABLESPACE_NAME
------------------------------
TESTTABLESPACE

Elapsed: 00:00:00.00

This may not seem like a big deal going from .15 seconds to .00 seconds (less than .01 seconds). But if you start running a query like this hundreds of thousands of times per hour the .15 seconds of CPU per execution can slow your system down.

See the zip for the details. The create index command looked like this:

SQL> create index testi on test (owner, table_name);

Index created.
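If you want to confirm that the optimizer is actually using the new index, something like this works (a sketch; it is not part of the zip above):

explain plan for
select TABLESPACE_NAME from test
where owner='TESTOWNER' and table_name = 'TESTTABLE';

select * from table(dbms_xplan.display);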

Bobby

Categories: DBA Blogs

Fixed Broken Links

Wed, 2021-04-28 11:48

I had a couple of new comments on older blog pages and I noticed that some links pointed to things which no longer exist. I fixed a few things that I knew were wrong. Finally, this week I decided to review every post back to the beginning 9 years ago and click on every link.

For dead links I removed the link and added a comment like (DOES NOT EXIST). In some cases, the link still existed but was different. I changed several links to Oracle’s documentation for example. In many cases I put (updated) or something like that to show that there was a new link. I synced up a lot of the old links to the current version of my scripts on GitHub. Hopefully these changes won’t make the older posts unreadable, but at least the links will point to useful and up to date versions of things.

I did not carefully read every post because I was in a hurry, but I did look at every post and it gave me the chance to see how things changed over the past 9 years. It was kind of a quick review of what I was thinking. Some of the posts seemed kind of dumb. (What was I thinking?) But others are genuinely useful. But it was interesting to see which topics I talked about and how that changed over time. It makes me wonder where things will go in the future. I guess I cannot expect to fully predict the future, but it is good to think about what I should be learning and what skills I should be developing as things go forward.

Anyway, hopefully the updated links will make the posts a little more helpful to myself and others.

Bobby

Categories: DBA Blogs

SymPy Tutorial Repository

Fri, 2021-04-09 11:02

I have been playing with the Python SymPy package and created a repository with my test scripts and notes:

https://github.com/bobbydurrett/sympytutorial

Might be helpful to someone. I just got started.

I had used Maxima before. SymPy and Maxima are both what Wikipedia calls “Computer Algebra Systems.” They have a nice list here:

https://en.wikipedia.org/wiki/List_of_computer_algebra_systems

I got a lot of use out of Maxima but I think it makes sense to switch to SymPy because it is written in Python and works well with other mainstream Python packages that I use like Matplotlib. They both fall under the SciPy umbrella of related tools so for me if I need some computer algebra I probably should stick with SymPy.

Maxima and SymPy are both free.

Bobby

Categories: DBA Blogs

ORA-14767 when day of month > 28 with interval partitioning month interval

Wed, 2021-04-07 18:07
SQL> CREATE TABLE test(
  2  RUN_DATE DATE,
  3  MY_NBR NUMBER(4)
  4  )
  5  PARTITION BY RANGE (RUN_DATE)
  6  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
  7  (
  8    PARTITION data_p1 VALUES LESS THAN (TO_DATE('01/29/2017', 'MM/DD/YYYY'))
  9  );
CREATE TABLE test(
*
ERROR at line 1:
ORA-14767: Cannot specify this interval with existing high bounds


SQL> 
SQL> CREATE TABLE test(
  2  RUN_DATE DATE,
  3  MY_NBR NUMBER(4)
  4  )
  5  PARTITION BY RANGE (RUN_DATE)
  6  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
  7  (
  8    PARTITION data_p1 VALUES LESS THAN (TO_DATE('01/28/2017', 'MM/DD/YYYY'))
  9  );

Table created.

Creating a range partitioned table with a date type partitioning column and a month interval must have a starting partition that has a day < 29 or it gets an ORA-14767 error.

The error message “Cannot specify this interval with existing high bounds” is not helpful. How about something like “Need a day of the month that exists in every month”? February only has 28 days most years so 28 is the max.

Bobby

Categories: DBA Blogs

Free Machine Learning Class from MIT

Thu, 2021-04-01 14:05

I noticed this new class from MIT:

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-036-introduction-to-machine-learning-fall-2020/

It is about machine learning and is free. I think it has some built in exercises with automatic grading but no instructor to interact with.

Since ML is such a hot topic I thought I would share it. I have not taken the class.

Bobby

Categories: DBA Blogs

60,000 sessions in 64 gigabyte VM using shared servers

Tue, 2021-03-30 11:40

Just a quick note. I have an application that is generating thousands of inactive sessions and with the default dedicated server configuration we are having to add more and more memory to our virtual host to support the connections. We estimate that the application may need 45,000 mostly inactive sessions once the application is fully rolled out. So, I thought about how much memory would be required to support 45,000 sessions using shared servers. In an earlier post I mentioned how I got the sessions up to about 11,000 so I just took the Java program from that post and tried to adjust memory parameters to support over 45,000. I got it up to over 60,000 so the test was essentially successful. I don’t think I would want to run a system with 60,000 sessions on a single node, but it is nice to see that it is to some degree possible.

I used a 64 gigabyte Linux VM and set these parameters:

sga_max_size=52G
sga_target=52G
shared_pool_size=36G
dispatchers='(PROTOCOL=TCP)(DISPATCHERS=64)'
max_shared_servers=16
shared_servers=16
large_pool_size=512M

Pretty sure that the large pool grew dynamically to fill the sga space not taken up by the shared pool. 52-36=16 gigabyte large pool.

Anyway, I don’t have time to write this up carefully now, but I wanted to publish the parameters.

Here is the previous post with the Java program I used to open 1000 connections:

https://www.bobbydurrettdba.com/2013/06/26/testing-maximum-number-of-oracle-sessions-supported-by-shared-servers/

I ended up running 30 of these on 3 servers for a total of 90,000 potential logins and got up to over 63,000.
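A couple of simple queries are enough to watch a test like this while it runs (a sketch, not from the original post):

select count(*) total_sessions from v$session;

select status, count(*)
from v$shared_server
group by status;

select count(*) circuits from v$circuit;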

Bobby

Categories: DBA Blogs

ORA-00904 DBMS_XPLAN HINT DBMS_LOB GRANT

Thu, 2021-03-18 18:06

I ran this query with a hint:

SQL> select /*+ full(my_tables) */ blocks
  2  from my_tables
  3  where
  4  owner = 'SYS' and
  5  table_name = 'TAB$';

    BLOCKS
----------
      1625

I ran this select to get the plan:

select * from table(dbms_xplan.display_cursor(null,null,'ALL'));

I was getting this error:

Column Projection Information (identified by operation id):
 1 - "BLOCKS"[NUMBER,22]
 ORA-00904: : invalid identifier

I found that my user or public needed an execute grant for DBMS_LOB to fix this:

SQL> grant execute on DBMS_LOB to PUBLIC;

Grant succeeded.

I am not sure why this grant was not in place on this database but it took a while to figure this out so I thought I would put it out there. I found the error in a trace and I suspected the issue was due to permissions. The trace was like:

PARSE ERROR ... err=904
 SELECT ST.* FROM XMLTABLE('/hint_usage/... DBMS_LOB.SUBSTR...

So that gave me the idea that I needed an execute grant on DBMS_LOB. EXECUTE ANY PROCEDURE did not do it.
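A quick way to check whether the grant is already in place is to look at DBA_TAB_PRIVS (a sketch):

select grantee, privilege
from dba_tab_privs
where owner = 'SYS'
  and table_name = 'DBMS_LOB';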

After the grant it shows the hint report. This is on 19c:

Column Projection Information (identified by operation id):
 1 - "BLOCKS"[NUMBER,22]
 Hint Report (identified by...
 Total hints for statement: 1
 1 -  SEL$1 / MY_TABLES@SEL$1
            -  full(my_tables)

Bobby

P.S. Full log of the script that got the error:

Full log of the working script:

Full length trace lines:

Categories: DBA Blogs

Finding queries with bad plans from an AWR report of a load test

Fri, 2021-03-12 16:10

I want to document some recent steps that I have been taking to support new development on a transactional system. Every time the system has a new release, if that release includes Oracle SQL changes, I am asked to create and review an AWR report covering the time of a load test (usually several tests) and to see if I see any problems. In the past I looked for longer running application SQL but recently I changed to look at anything that averages over .1 seconds and that has been helpful. So, that is what this post is about. Obviously, if you have faster hardware or different workloads this rule of thumb will not help you. But maybe other higher-volume transactional systems will follow similar patterns.
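The same .1 second check can be scripted directly against AWR. Here is a rough sketch (not my actual script; :begin_snap and :end_snap are placeholders and ELAPSED_TIME_DELTA is in microseconds):

select sql_id,
       sum(executions_delta) executions,
       sum(elapsed_time_delta)/1000000/sum(executions_delta) avg_elapsed_seconds
from dba_hist_sqlstat
where snap_id between :begin_snap and :end_snap
group by sql_id
having sum(executions_delta) > 0
   and sum(elapsed_time_delta)/1000000 > .1 * sum(executions_delta)
order by avg_elapsed_seconds desc;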

Here is the top elapsed SQL from a recent load test:

SQL Ordered by Elapsed Time

I have only shown some of the columns to hide some details that I do not want to make public. Every SQL here whose “Elapsed Time per Exec (s)” value is above .1 seconds is not part of the application except the last one, 6kmnq0uj99a3c. This was a load test on a non-production system that ramped up a bunch of activity over several hours. This problem query only ran 664 times so if that is representative of how often it runs in production it may not really matter that it is inefficient. But you never know for sure, so I reviewed it anyway. All the queries listed that run in .03, .01, .02, and .00 seconds are representative of the typical queries with good plans on this system. So, that is why .1 ended up standing out. Also, not long ago I found two or three running in production with more than .1 seconds average runtime and they needed new indexes to bring them back under the .1 second threshold. So, for me .1 seconds is the current magical number.

To test it I used two of my scripts.

  • bind2.sql – to extract values used in one of the executions
  • test2.sql – to find out which table the execution spent the most time on

I replaced the bind variables with constants and ran the query in my test2.sql script and found that most of the time was on a certain range scan on what looked like the correct index. But on closer inspection I realized that a type conversion had prevented the last column of the index from being used. Here is what it looked like with the columns renamed to hide the production names.

filter((TO_NUMBER(MY_NUMBER)="XYZ"."MY_NUMBER" ...

One table has the “MY_NUMBER” column as a character string and the other as a number. So, it was doing a range scan and not a unique scan. I changed the query to convert the number to a character string and the plan used a unique scan.

ABC.MY_NUMBER= to_char(XYZ.MY_NUMBER)

Table ABC was the one that was doing a range scan on three columns and not on MY_NUMBER, the last column in the index. MY_NUMBER is a character column on ABC. XYZ was the other table with MY_NUMBER as a NUMBER type column. I am forcing the conversion of XYZ.MY_NUMBER to a character for the comparison instead of letting the optimizer choose to convert ABC.MY_NUMBER to a number which would suppress the use of the last column of the index on table ABC.

My point was not to talk about implicit type conversions preventing indexes from being used although that is very interesting. My point is that a plan like this that is pretty efficient could run less than .1 seconds if the index was used correctly. And if the application users end up scaling the use of this query way up to thousands or tens of thousands of executions per hour that unique scan could make a huge difference over the range scan without the last column of the index. Your CPUs might be 10 times faster than mine so your threshold might be lower than .1 seconds, but I think the idea is the same. There is some threshold that indicates a simple, frequently used, transactional SQL may not be using the right indexes. Does not apply in all cases but at the moment this is a useful rule of thumb for me.

I had just written the previous paragraphs before getting an email that our QA team had run another load test with the to_char explicit type conversion in place. It did not make as great of an improvement as I expected. Here are some edited outputs from my sqlstat.sql script:

Original query 6kmnq0uj99a3c:

END_INTERVAL_TIME     EXECUTIONS_DELTA Elapsed Average ms
 
 09-MAR-21 11.00.08 AM               79         171.306696
 09-MAR-21 12.00.35 PM               84         176.152667
 09-MAR-21 01.00.03 PM               80         178.420588
 09-MAR-21 02.00.32 PM               80         171.877913
 09-MAR-21 03.00.01 PM               81         174.509975
 09-MAR-21 04.00.29 PM               83         180.367157

New query 2ndfgypwp3qf0 with the to_char to allow the unique index scan:

END_INTERVAL_TIME     EXECUTIONS_DELTA Elapsed Average ms
 
 12-MAR-21 09.00.15 AM               80         107.822088
 12-MAR-21 10.00.44 AM               83         104.453446
 12-MAR-21 11.00.12 AM               81          105.34042
 12-MAR-21 12.00.42 PM               80          103.05625
 12-MAR-21 01.00.12 PM               79         106.738557
 12-MAR-21 02.00.42 PM               82         101.285183
 12-MAR-21 03.00.12 PM               81         105.172531

Kind of disappointing. I expected a greater improvement based on my testing. Still, .1 seconds per execution is better than .17. Maybe if the tables grow with more data over time this improvement will be greater.

Even though this query did not turn out to have a dramatic improvement, I did find a way to improve the plan. My .1 seconds cutoff pointed me to a query that did not have the ideal use of indexes and led to an improvement in performance. In other cases in the past I have seen 20x improvements, so it is worth reviewing the ones over .1 seconds.

Bobby

Categories: DBA Blogs

Simple tools I use

Thu, 2021-03-11 17:48

Just a quick note about two tools I use:

TextPad – my favorite text editor. I know everyone has their own, but this is mine.

UnxUtils – Unix-like tools on a Windows command line.

Bobby

Categories: DBA Blogs

Wrapped Lines and Squished Pictures

Wed, 2021-03-10 17:11

I have been having trouble using WordPress on this blog. I keep having long lines wrapped instead of having a slider that readers can use to see the end of the lines. Also, pictures that looked fine when I posted them later look squished together. Yuck.

Long Lines

First, I will try to put some longer lines of output here using the preformatted type of block:

SQL_ID        PLAN_HASH_VALUE END_INTERVAL_TIME         EXECUTIONS_DELTA Elapsed Average ms CPU Average ms IO Average ms Cluster Average ms Application Average ms Concurrency Average ms Average buffer gets Average disk reads Average disk write megabytes Average rows processed
 
 6kmnq0uj99a3c        65249283 09-MAR-21 01.00.03.869 PM               80         178.420588        163.875             0                  0                      0                      0          13345.9375                  0                            0                  829.6
 6kmnq0uj99a3c        65249283 09-MAR-21 02.00.32.536 PM               80         171.877913        159.875             0                  0                      0                      0          13122.1375                  0                            0               816.0125
 6kmnq0uj99a3c        65249283 09-MAR-21 03.00.01.160 PM               81         174.509975     159.876543             0                  0                      0                      0          13145.2346                  0                            0             818.111111
 6kmnq0uj99a3c        65249283 09-MAR-21 04.00.29.556 PM               83         180.367157     164.939759             0                  0                      0                      0          13286.4337                  0                            0             825.843373
 6kmnq0uj99a3c        65249283 09-MAR-21 05.00.56.089 PM               40           26.11575           21.5        1.9689                  0                      0                      0               915.7              3.425                            0                     51

Notice how it wraps around and looks unreadable. I could swear that either a preformatted or a code block did not wrap in the recent past. Here is the same text in a code block:

SQL_ID        PLAN_HASH_VALUE END_INTERVAL_TIME         EXECUTIONS_DELTA Elapsed Average ms CPU Average ms IO Average ms Cluster Average ms Application Average ms Concurrency Average ms Average buffer gets Average disk reads Average disk write megabytes Average rows processed
------------- --------------- ------------------------- ---------------- ------------------ -------------- ------------- ------------------ ---------------------- ---------------------- ------------------- ------------------ ---------------------------- ----------------------
6kmnq0uj99a3c        65249283 09-MAR-21 01.00.03.869 PM               80         178.420588        163.875             0                  0                      0                      0          13345.9375                  0                            0                  829.6
6kmnq0uj99a3c        65249283 09-MAR-21 02.00.32.536 PM               80         171.877913        159.875             0                  0                      0                      0          13122.1375                  0                            0               816.0125
6kmnq0uj99a3c        65249283 09-MAR-21 03.00.01.160 PM               81         174.509975     159.876543             0                  0                      0                      0          13145.2346                  0                            0             818.111111
6kmnq0uj99a3c        65249283 09-MAR-21 04.00.29.556 PM               83         180.367157     164.939759             0                  0                      0                      0          13286.4337                  0                            0             825.843373
6kmnq0uj99a3c        65249283 09-MAR-21 05.00.56.089 PM               40           26.11575           21.5        1.9689                  0                      0                      0               915.7              3.425                            0                     51

Basically, the same problem although font and background are different. One thing I have done in the past is use a GitHub Gist. I would paste the text into a gist and put the URL inline like this:

https://gist.github.com/bobbydurrett/792f10405a7c4c6acbf965abc31ad3c6

This no longer seems to work. I had to go back and change a bunch of posts with links like this to embed the gist in the posts. To do that I had an amusing set of steps:

  1. Create a new Paragraph block
  2. Add one space
  3. Choose Edit as HTML
  4. Paste in embedded gist between the <p> and </p>

Example of what I have to paste in:

<p><script src="https://gist.github.com/bobbydurrett/792f10405a7c4c6acbf965abc31ad3c6.js"></script></p>

Here are the long lines as an embedded gist:

It would be great if there were a simpler way to do this. Maybe there is.

Pictures

The second challenge is that when I paste in screenshots, they get all squished. Here is a graphical version of the same type data:

The picture is not square, so it gets squished in. It is nice that you can click on it and see the big version, but I would like it to not be so ugly beforehand.

Thumbnail is 150 x 150 and very small.

75% is still squished

50% is not squished but the text is small. At least you can click on it and the big version pops up.

As I am writing this I realize there is a guide that you can click on to manually size the picture, and it shows you how far to the right you can size it before it starts getting squished.

So, I guess for now I am stuck with either making my text lines short enough to fit or sticking them in a Gist. For images I just need to size them with the little tool to keep them within the margins, so they do not get pushed in to fit.

Bobby

Categories: DBA Blogs

DBVERIFY (dbv) outputs block_id for bigfiles

Wed, 2021-03-03 11:50

I posted a question about this and didn’t find an answer. I searched both Oracle’s support site and the internet in general. I ran the DBVERIFY utility dbv against a bigfile datafile that had corrupt blocks and wanted to relate the blocks back to a table using the DBA_EXTENTS view. For smallfile datafiles I could have used dbms_utility.data_block_address_block. But the manual says that it does not work with bigfiles. I did a small test and found that with bigfiles the address output by dbv is just the block_id within the data file. With a smallfile tablespace it was some combination of block_id and file_id. Really, it is more helpful for dbv to spit out the block_id if you are running it against a datafile because you already know which datafile you have. I will include some of the output of the test below.
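For comparison, this is roughly how the smallfile case decodes a DBA using dbms_utility (a sketch; 12345678 is a made-up block address, not one from my test):

select dbms_utility.data_block_address_file(12345678) file_id,
       dbms_utility.data_block_address_block(12345678) block_id
from dual;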

Steps of my test:

  1. create a small bigfile tablespace
  2. create empty table nologging
  3. take a rman backup
  4. do a nologging update
  5. delete tablespace/datafile
  6. restore and recover it
  7. verify corruption exists
  8. run dbv to get DBA – block address
  9. run rman backup validate to get file id and block id
  10. select from dba_extents to get block locations for table
1 - create a small bigfile tablespace

create bigfile tablespace big datafile '/home/oracle/product/oradata/ORCL/big.dbf' size 10M;

[oracle@ora19 bigfiletests]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Mar 3 07:46:15 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.10.0.0.0

SQL> create bigfile tablespace big datafile '/home/oracle/product/oradata/ORCL/big.dbf' size 10M;

Tablespace created.


This was on a small 19c test database on a Linux VM.

2 - create empty table nologging


ORCL:SYSTEM>create table bigfiletest tablespace big nologging as select * from dba_tables where 1=2;

Table created.

Had to create table with NOLOGGING to make the insert append below unrecoverable.

3 - take a rman backup
[oracle@ora19 ORCL]$ rman target /

Recovery Manager: Release 19.0.0.0.0 - Production on Wed Mar 3 08:11:29 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1534990846)

RMAN> backup database;

Starting backup at 03-MAR-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=37 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/product/oradata/ORCL/system01.dbf
input datafile file number=00003 name=/home/oracle/product/oradata/ORCL/sysaux01.dbf
input datafile file number=00004 name=/home/oracle/product/oradata/ORCL/undotbs01.dbf
input datafile file number=00005 name=/home/oracle/product/oradata/ORCL/big.dbf
input datafile file number=00007 name=/home/oracle/product/oradata/ORCL/users01.dbf
channel ORA_DISK_1: starting piece 1 at 03-MAR-21
channel ORA_DISK_1: finished piece 1 at 03-MAR-21
piece handle=/home/oracle/product/db/19.0.0/dbs/04voq09o_4_1_1 tag=TAG20210303T081136 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:02:05
Finished backup at 03-MAR-21

Starting Control File and SPFILE Autobackup at 03-MAR-21
piece handle=/home/oracle/product/db/19.0.0/dbs/c-1534990846-20210303-02 comment=NONE
Finished Control File and SPFILE Autobackup at 03-MAR-21

Just a regular backup before doing the unrecoverable insert append into nologging table.

4 - do a nologging update (insert append and commit)

ORCL:SYSTEM>insert /*+ append */ into bigfiletest
  2  select * from dba_tables where rownum < 2;

1 row created.

ORCL:SYSTEM>
ORCL:SYSTEM>commit;

Commit complete.

Just one row – should be one corrupt block.

5 - delete tablespace/datafile

[oracle@ora19 ORCL]$ cd /home/oracle/product/oradata/ORCL
[oracle@ora19 ORCL]$ ls -altr
total 2813440
drwxr-x---. 3 oracle oinstall         17 Jul 30  2019 ..
-rw-r-----. 1 oracle oinstall  209715712 Mar  3 07:50 redo02.log
-rw-r-----. 1 oracle oinstall  209715712 Mar  3 07:50 redo03.log
-rw-r-----. 1 oracle oinstall   57679872 Mar  3 07:51 temp01.dbf
drwxr-x---. 2 oracle oinstall       4096 Mar  3 08:02 .
-rw-r-----. 1 oracle oinstall 1142956032 Mar  3 08:11 system01.dbf
-rw-r-----. 1 oracle oinstall  692068352 Mar  3 08:11 sysaux01.dbf
-rw-r-----. 1 oracle oinstall  356524032 Mar  3 08:11 undotbs01.dbf
-rw-r-----. 1 oracle oinstall    5251072 Mar  3 08:11 users01.dbf
-rw-r-----. 1 oracle oinstall   10493952 Mar  3 08:14 big.dbf
-rw-r-----. 1 oracle oinstall  209715712 Mar  3 08:15 redo01.log
-rw-r-----. 1 oracle oinstall   10600448 Mar  3 08:15 control01.ctl
-rw-r-----. 1 oracle oinstall   10600448 Mar  3 08:15 control02.ctl
[oracle@ora19 ORCL]$ rm big.dbf

Now the unrecoverable nologging insert append change is lost. It is not on the backup and not on the redo or archived redo logs.

6 - restore and recover it

[oracle@ora19 ORCL]$ rman target /

Recovery Manager: Release 19.0.0.0.0 - Production on Wed Mar 3 08:16:07 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1534990846)

RMAN> alter tablespace big offline immediate;

using target database control file instead of recovery catalog
Statement processed

RMAN> restore tablespace big;

Starting restore at 03-MAR-21
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=37 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00005 to /home/oracle/product/oradata/ORCL/big.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/product/db/19.0.0/dbs/04voq09o_4_1_1
channel ORA_DISK_1: piece handle=/home/oracle/product/db/19.0.0/dbs/04voq09o_4_1_1 tag=TAG20210303T081136
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 03-MAR-21

RMAN> recover tablespace big;

Starting recover at 03-MAR-21
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:00

Finished recover at 03-MAR-21

RMAN> alter tablespace big online;

Statement processed

Simple tablespace restore and recovery. Had to alter tablespace offline immediate because the file was not there.

7 - verify corruption exists

ORCL:SYSTEM>select * from bigfiletest;
select * from bigfiletest
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 787)
ORA-01110: data file 5: '/home/oracle/product/oradata/ORCL/big.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option

This just shows that the block is corrupt. It also gives us the file number (which we already knew) and the block id which would relate back to DBA_EXTENTS.

8 - run dbv to get DBAs - block addresses

dbv file=/home/oracle/product/oradata/ORCL/big.dbf blocksize=8192

[oracle@ora19 ORCL]$ dbv file=/home/oracle/product/oradata/ORCL/big.dbf blocksize=8192

DBVERIFY: Release 19.0.0.0.0 - Production on Wed Mar 3 08:21:45 2021

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/product/oradata/ORCL/big.dbf

DBV-00201: Block, DBA 787, marked corrupt for invalid redo application

This was kind of the moment of truth. The DBA from the DBVERIFY utility dbv was 787 which is the same as the block number in the error from the select.

9 - run rman backup validate to get file id and block ids

[oracle@ora19 ORCL]$ rman target /

Recovery Manager: Release 19.0.0.0.0 - Production on Wed Mar 3 08:30:47 2021
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1534990846)

RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE;

...

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
5    OK     1              53           1280            2503068
  File Name: /home/oracle/product/oradata/ORCL/big.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              1
  Index      0              0
  Other      0              1226


ORCL:SYSTEM>select * from V$DATABASE_BLOCK_CORRUPTION;

no rows selected

ORCL:SYSTEM>
ORCL:SYSTEM>select * from V$NONLOGGED_BLOCK;

     FILE#     BLOCK#     BLOCKS NONLOGGED_START_CHANGE# NONLOGGED NONLOGGED_END_CHANGE# NONLOGGED RESETLOGS_CHANGE# RESETLOGS OBJECT#                                  REASON        CON_ID
---------- ---------- ---------- ----------------------- --------- --------------------- --------- ----------------- --------- ---------------------------------------- --------- ----------
         5        787          1                 2502865                         2502865                     1920977 02-MAY-19 74043                                    UNKNOWN            0

RMAN VALIDATE has the same block number – 787.

10 - select from dba_extents to get block locations for table

ORCL:SYSTEM>select * from dba_extents
  2  where segment_name='BIGFILETEST';

OWNER  SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------ ------------ -------------- ------------ --------------- --------- ------- -------- ----- ------ ------------
SYSTEM BIGFILETEST                 TABLE        BIG                     0       5      784 65536      8         1024

Seems to be the fourth block of the extent. The extent starts at block 784 and block 787 is corrupt.

I had a larger test database with many corrupt blocks due to the way we had populated it with an RMAN restore and recover. I knew which table was corrupt because I ran select count(*) queries against every table on the database and only found one corrupt. I checked the DBA values from dbv against the DBA_EXTENTS view for over 300 sample corrupt blocks, and they all pointed back to the table I knew was corrupt. I queried it like this:

SELECT 
tablespace_name, segment_type, owner, segment_name 
FROM my_extents 
WHERE file_id = 29 and 15340893 between block_id AND block_id + blocks - 1;

I created the my_extents table from dba_extents to speed up these queries:

create table my_extents as
select * from dba_extents;
create index my_extents_i1 on my_extents(block_id);

execute dbms_stats.gather_table_stats('MYOWNER','MY_EXTENTS');

Anyway, I do not know if this holds true in every situation, but it appears that the DBA values from dbv for bigfiles correspond to the block_id values in DBA_EXTENTS.

Bobby

Categories: DBA Blogs

$6 per month for blog on AWS

Tue, 2021-03-02 17:59

It looks like this blog is costing me about $6 per month on AWS which is cool. I was paying about $200/year or about $17 per month on iPage. I am not sure what I am missing. This blog is much faster on AWS even with a minimal size virtual machine.

I did pay a fixed, up-front $12 to switch my domain registrar to AWS and that is not included in the $6/month. The compute was $123.97 including tax for 3 years or 36 months. That buys you one 2.5 gigahertz processor and 1 gig of memory. Disk is extra, as are a couple of other things. The extras were $2.27 for February which of course only has 28 days. If the site were to get busy, then the monthly price would go up but that seems unlikely.

Not sure what I am missing. iPage did more for you and could be used by a non-technical person but if you can handle the technical part AWS seems better and cheaper.

Bobby

Categories: DBA Blogs

Created MySQL Scripts Repository

Fri, 2021-02-26 12:30

Most of my work is with Oracle databases but I have been starting to learn a little bit about MySQL. Where I work, we have a bunch of on-premises Oracle databases including large and active ones with various challenging performance tuning problems that I enjoy working on. But for the last couple of years or so we have been using MySQL on RDS within Amazon’s cloud (AWS). I have not had many chances to work on a MySQL performance issue, but I want to be ready when the time comes. Back in 2019 I read several chapters of the MySQL 5.7 reference manual and started looking at tuning. Last week I picked it up again. I was mainly looking at Amazon’s Performance Insights that we have running on our RDS databases and seeing what information it gave us and how to link that back to other tables/views that MySQL uses for performance. Anyway, I have made notes on our internal web pages, but I also decided to create a public GitHub repository with some of the SQL and Python scripts that I have created so far. It is very basic, and I am sure that other people have more sophisticated tools, but I think it is helpful to me to create the repository and to update it as I learn more. So, if you are interested here it is:

https://github.com/bobbydurrett/MySQL-Scripts
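As an example of the kind of query involved, here is roughly the sort of starting point I use against the MySQL performance schema (a sketch, not necessarily what is in the repository; SUM_TIMER_WAIT is in picoseconds):

SELECT digest_text,
       count_star AS executions,
       sum_timer_wait/1e12 AS total_seconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;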

Bobby

Categories: DBA Blogs

Updated Centos VM to Oracle Linux

Sun, 2021-01-31 12:59

I was using Centos Linux for Nethack programming and playing on my home laptop. Evidently Centos is going away at the end of 2021 and Tim Hall had a post about moving Centos to Oracle Linux. So, I did it. The theory was that I would still be able to get updates from Oracle in 2022 and beyond.

I followed the instructions from here: https://linux.oracle.com/switch/centos/

I did this while logged in as root:

git clone https://github.com/oracle/centos2ol
cd centos2ol
bash centos2ol.sh

I got this error:

Error: Package: tkinter-2.7.5-88.el7.x86_64 (@base/7)
           Requires: python = 2.7.5-88.el7

I do not need tkinter so I just removed it:

yum remove tkinter-2.7.5-88.el7.x86_64

Then I just did a yum update and it seemed fine. The only odd thing is that when I reboot, I get an error like this:

integrity: Unable to open file: /etc/keys/x509_ima.der (-2)

I looked up this message and it seems like I cannot fix it but can just ignore it. Not a big deal but would be cleaner without it.

It was easy to switch but I just thought I would document the two error messages, even though they did not cause any problems.

Bobby

Categories: DBA Blogs
