Hi Everyone!!
Recently there was a outage on one of the financials 11i production system for couple of hours. Applications support team received bunch of site scope alarms saying unable to do synthetic transactions on finprod which was running on 11i EBS with 10g DB. Production red team call got started, people tried to reproduce the issue, Login page didn't come up.
Troubleshooting was attempted by tier 2 DBA's. they tried basic sqlplus apps account connections, jvm logs/alert log verification etc, didn't find anything odd. All basic check alert log, jvm log etc looked good from their perspective. DBA's pushed to network team for any connection issues to see anything odd observed from load balancer, reverse proxies.
Meanwhile escalation came to tier 3 as well.
I had to jump in and see what the system is doing, though my basic tests were looking okay, huge library cache pin locks on the DB was evident. OEM was showing high concurrency wait event as well.
SQL> select s.inst_id Inst,s.p1,s.p1raw,s.p2,s.p2raw,s.p3,s.p3raw,
sid Holder,
2 3 KGLPNMOD Held,
4 KGLPNREQ Req
5 from x$kglpn , gv$session s
6 where KGLPNHDL in (select p1raw from gv$session_wait
7 where wait_time=0 and event like 'library cache pin%')
8 and KGLPNMOD <> 0
9 and s.saddr=x$kglpn.kglpnuse;
INST P1 P1RAW P2 P2RAW P3 P3RAW HOLDER HELD REQ
---------- ---------- ---------------- ---------- ---------------- ---------- ---------------- ---------- ---------- -------
1 2.2095E+10 0000000524FCEEF8 1.8908E+10 000000046708A190 201 00000000000000C9 700 2 0
1 2.2095E+10 0000000524FCEEF8 2.0967E+10 00000004E1B6F448 201 00000000000000C9 2082 2 0
1 1232 00000000000004D0 317900 000000000004D9CC 8 0000000000000008 2310 2 0
1 2.2095E+10 0000000524FCEEF8 1.9799E+10 000000049C164D08 201 00000000000000C9 3004 2 0
1 2.2095E+10 0000000524FCEEF8 1.8388E+10 000000044808B820 201 00000000000000C9 3648 2 0
1 2.2095E+10 0000000524FCEEF8 1.9397E+10 0000000484284D90 201 00000000000000C9 3688 2 0
1 2.2095E+10 0000000524FCEEF8 1.9585E+10 000000048F553A28 201 00000000000000C9 3867 2 0
1 2.2095E+10 0000000524FCEEF8 1.9597E+10 000000049014AC20 201 00000000000000C9 4450 2 0
8 rows selected.
SQL>
SQL> SELECT kglnaown "Owner", kglnaobj "Object"
FROM x$kglob
WHERE 2 3 kglhdadr='0000000524FCEEF8';
Owner Object
---------------------------------------------------------------- ------------------------------
SYS DBMS_UTILITY
SQL>
SQL> l
1 SELECT
2 B.SID,a.KGLPNMOD, a.KGLPNREQ, b.username, c.KGLNAOBJ, c.KGLOBTYP FROM
3 x$kglpn a,
4 v$session b,
5 x$kglob c
6 WHERE
7 a.KGLPNUSE = b.saddr and c.kglnaobj='DBMS_UTILITY' AND
8* a.KGLPNHDL = c.KGLHDADR
SQL> /
SID KGLPNMOD KGLPNREQ USERNAME KGLNAOBJ KGLOBTYP
---------- ---------- ---------- ------------------------------ ------------------------------ ----------
400 0 2 SOAUSER0 DBMS_UTILITY 9
419 0 2 SOAUSER DBMS_UTILITY 9
449 0 2 APPS DBMS_UTILITY 9
457 0 2 APPS DBMS_UTILITY 9
..
..
..
.. 462 0 2 APPS DBMS_UTILITY 9
466 0 2 APPS DBMS_UTILITY 9
5316 0 2 APPS DBMS_UTILITY 9
5332 0 2 APPS DBMS_UTILITY 9
5336 0 3 SYS DBMS_UTILITY 9
5376 0 2 APPS DBMS_UTILITY 9
5401 0 2 APPS DBMS_UTILITY 9
5436 0 2 ORAWEB DBMS_UTILITY 9
5751 0 2 APPS DBMS_UTILITY 9
5787 0 2 APPS DBMS_UTILITY 9
5823 2 0 APPS DBMS_UTILITY 5
I have truncated the output for readability purposes. Output was showing different numbers at different time ranging 230 rows to 325 rows.
Similar post i have made earlier
http://applicationsdba.blogspot.com/2010/11/troubleshooting-library-cache-pin-wait.html
After closely looking output, one of the session launched by 'SYS' user showing the mode requested as 3 - which is exclusive.
I figured out the full session details and found that one of the DBA is running 'REVOKE ' script on DBMS_UTILITY package from OPS$ users.(app support accounts). I went ahead and killed the session. System came back to normal. phew!!
Tier2 DBA's have been advised to run any scripts in production after business hours to reduce the impact of the issue.
Hope this will help some one.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment