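Friday, August 30, 2013

Tracking down memory a driver did not free before unloading

I have solved this problem in real projects before, and it comes up often in practice, so I am writing it down as a summary and for future reference.

Here is a sample project first. The code is as follows: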
#include <ntifs.h>

#define TAG 'tset' // This driver's pool tag; its bytes read "test" in memory.

DRIVER_UNLOAD Unload;
VOID Unload(__in PDRIVER_OBJECT DriverObject)
{  

}

#pragma INITCODE
DRIVER_INITIALIZE DriverEntry;
NTSTATUS DriverEntry(__in struct _DRIVER_OBJECT * DriverObject, __in PUNICODE_STRING RegistryPath)
{
    NTSTATUS status = STATUS_SUCCESS ;
    PVOID p = 0;

    KdBreakPoint();// == DbgBreakPoint()        

    p = ExAllocatePoolWithTag(NonPagedPool, 9, TAG);
    if (p == NULL) {
        status = STATUS_UNSUCCESSFUL;
    }  

    DriverObject->DriverUnload = Unload;  
   
    return status;
}
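
The comment on TAG says the tag reads as "test" even though the source spells it 'tset'. A minimal user-mode sketch of why, assuming a little-endian machine and a compiler that packs 'tset' into 0x74736574 (MSVC and GCC do, but multi-character constants are implementation-defined). TAG_VALUE and pool_tag_to_text are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value that MSVC and GCC give the multi-character constant 'tset'
   ('t'=0x74, 's'=0x73, 'e'=0x65, 't'=0x74). Written in hex here because
   multi-character constants are implementation-defined. */
#define TAG_VALUE 0x74736574u

/* Hypothetical helper: write the tag's four bytes in memory order
   (lowest byte first, as on little-endian x86/x64) plus a terminator. */
void pool_tag_to_text(uint32_t tag, char out[5])
{
    size_t i;
    assert(out != NULL);
    for (i = 0; i < 4; i++) {
        out[i] = (char)((tag >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}
```

Calling pool_tag_to_text(TAG_VALUE, buf) fills buf with the same four characters that the debugger shows in its Tag column.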

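The code above is very simple, and I expect everyone can see that if the driver loads successfully, the problem this article is about will occur: the memory allocated in DriverEntry is never freed.
A real project, however, is complex. It may have dozens of files, each with thousands of lines of code.

So what we do now is this: even though we have the source, ignore it for the moment, or treat it as too complex to read through.

1. Load the driver.
2. Start and configure Driver Verifier Manager. Be sure to enable the pool tracking option in this step.
   For details, see: http://bbs.pediy.com/showthread.php?t=156804
3. Reboot the computer. Before rebooting you can review the settings you just made.
4. Load the driver again.
5. If it loads successfully, unload it. It is best to have a two-machine kernel debugging session connected at this point.
   Once unloading completes, that is, after the code of the Unload function has finished running, a blue screen appears. The analysis below should probably also apply to a memory dump file.
   The blue screen output is as follows: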
0: kd> g

*** Fatal System Error: 0x000000c4
                       (0x0000000000000062,0xFFFFFA80059C5C88,0xFFFFFA8004F5A010,0x0000000000000001)

Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

Connected to Windows 7 7601 x64 target at (Sat Jul 13 10:17:30.820 2013 (UTC + 8:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
...................................

Press ctrl-c (cdb, kd, ntsd) or ctrl-break (windbg) to abort symbol loads that take too long.
Run !sym noisy before .reload to track down problems loading symbols.

.............................
........
Loading User Symbols
.................................
Loading unloaded module list
.....Unable to enumerate user-mode unloaded modules, Win32 error 0n30
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck C4, {62, fffffa80059c5c88, fffffa8004f5a010, 1}

Probably caused by : test.sys

Followup: MachineOwner
---------

nt!DbgBreakPointWithStatus:
fffff800`0167bb90 cc              int     3


Run the !analyze -v command as prompted.
0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
A device driver attempting to corrupt the system has been caught.  This is
because the driver was specified in the registry as being suspect (by the
administrator) and the kernel has enabled substantial checking of this driver.
If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will
be among the most commonly seen crashes.
Arguments:
Arg1: 0000000000000062, A driver has forgotten to free its pool allocations prior to unloading.
Arg2: fffffa80059c5c88, name of the driver having the issue.
Arg3: fffffa8004f5a010, verifier internal structure with driver information.
Arg4: 0000000000000001, total # of (paged+nonpaged) allocations that weren't freed.
  Type !verifier 3 drivername.sys for info on the allocations
  that were leaked that caused the bugcheck.

Debugging Details:
------------------


BUGCHECK_STR:  0xc4_62

IMAGE_NAME:  test.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  51e0b488

MODULE_NAME: test

FAULTING_MODULE: fffff88004c72000 test

VERIFIER_DRIVER_ENTRY: dt nt!_MI_VERIFIER_DRIVER_ENTRY fffffa8004f5a010
Symbol nt!_MI_VERIFIER_DRIVER_ENTRY not found.

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

PROCESS_NAME:  services.exe

CURRENT_IRQL:  2

LAST_CONTROL_TRANSFER:  from fffff8000176e212 to fffff8000167bb90

STACK_TEXT:
fffff880`03f31c98 fffff800`0176e212 : 00000000`00000062 fffffa80`054d28f0 00000000`00000065 fffff800`016beb78 : nt!DbgBreakPointWithStatus
fffff880`03f31ca0 fffff800`0176effe : fffffa80`00000003 00000000`00000000 fffff800`016bf3d0 00000000`000000c4 : nt!KiBugCheckDebugBreak+0x12
fffff880`03f31d00 fffff800`01683e44 : 00000000`00000000 fffff880`04c79000 00000000`00000003 00000000`00000000 : nt!KeBugCheck2+0x71e
fffff880`03f323d0 fffff800`01b0c3dc : 00000000`000000c4 00000000`00000062 fffffa80`059c5c88 fffffa80`04f5a010 : nt!KeBugCheckEx+0x104
fffff880`03f32410 fffff800`01b1b54a : 00000000`00000001 00000000`00000000 fffff880`04c72000 00000000`00000001 : nt!VerifierBugCheckIfAppropriate+0x3c
fffff880`03f32450 fffff800`0176fa70 : 00000000`00000000 00000000`00000000 fffff800`017f8e80 00000000`00000000 : nt!VfPoolCheckForLeaks+0x4a
fffff880`03f32490 fffff800`01a352de : fffffa80`059c5bd0 00000000`00000000 00000000`00000000 00000000`00000000 : nt!VfTargetDriversRemove+0x160
fffff880`03f32530 fffff800`01a59d33 : 00000000`00000000 00000000`000e0082 00000000`00000000 00000000`00000001 : nt!VfDriverUnloadImage+0x2e
fffff880`03f32560 fffff800`01a5a1ad : 00000000`00000000 fffffa80`059c5bd0 00000000`00000000 00000000`00010200 : nt!MiUnloadSystemImage+0x283
fffff880`03f325d0 fffff800`01afb6e1 : 00000000`00000000 fffff880`03f328f0 fffffa80`03cfb180 00000000`00000018 : nt!MmUnloadSystemImage+0x4d
fffff880`03f32610 fffff800`0168d004 : 00000000`00000000 fffff880`03f328f0 fffffa80`03cfb180 fffff880`03f328f0 : nt!IopDeleteDriver+0x41
fffff880`03f32640 fffff800`01a6ba3e : fffff880`03f328f0 00000000`00000000 00000000`c0000001 fffff800`00000000 : nt!ObfDereferenceObject+0xd4
fffff880`03f326a0 fffff800`01682fd3 : fffffa80`054d28f0 fffff880`03f328c0 00000000`00000001 fffff980`01450000 : nt!IopUnloadDriver+0x45c
fffff880`03f32870 fffff800`0167f570 : fffff800`01a6b737 00000000`0104e820 00000000`00000001 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
fffff880`03f32a08 fffff800`01a6b737 : 00000000`0104e820 00000000`00000001 00000000`00000000 00000000`00cb8f20 : nt!KiServiceLinkage
fffff880`03f32a10 fffff800`01682fd3 : fffffa80`054d28f0 fffff880`03f32c60 00000000`00000000 00000000`00000000 : nt!IopUnloadDriver+0x155
fffff880`03f32be0 00000000`779a2b8a : 00000000`ffb75879 00000000`00341460 00000000`00341460 00000000`00000334 : nt!KiSystemServiceCopyEnd+0x13
00000000`0104e7f8 00000000`ffb75879 : 00000000`00341460 00000000`00341460 00000000`00000334 00000000`0000000a : ntdll!NtUnloadDriver+0xa
00000000`0104e800 00000000`ffb7575a : 00000000`0000000a 00000000`00000000 00000000`0104e970 00000000`00000334 : services!ScUnloadDriver+0xa9
00000000`0104e840 00000000`ffb6118b : 00000000`00000000 00000000`00000020 00000000`00000001 00000000`00240022 : services!ScControlDriver+0x112
00000000`0104e870 00000000`ffb4907b : 00000000`00000001 00000000`0104ede0 00000000`00000000 000007fe`00000000 : services!ScControlService+0x192
00000000`0104e940 000007fe`ff5a23d5 : 00000000`00000003 00000000`0104edf8 00000000`0104ef40 000007fe`ff5a2396 : services!RControlService+0x4b
00000000`0104e9b0 000007fe`ff5969b2 : 00000000`0104ede0 00000000`ffb80192 00000000`00000020 00000000`ffb7ed54 : RPCRT4!Invoke+0x65
00000000`0104ea10 000007fe`ff59338d : 00000000`0104f120 00000000`00000000 00000000`00000000 00000000`00000000 : RPCRT4!NdrStubCall2+0x32a
00000000`0104f030 000007fe`ff5950f4 : 00000000`00000000 00000000`00000001 00000000`00000000 00000000`003ea9c0 : RPCRT4!NdrServerCall2+0x1d
00000000`0104f060 000007fe`ff594f56 : 00000000`00000000 00000000`003e1680 00000000`0104f210 00000000`003d2090 : RPCRT4!DispatchToStubInCNoAvrf+0x14
00000000`0104f090 000007fe`ff595679 : 00000000`00361878 000007fe`ff58a9df 00000000`00000308 00000000`003ea870 : RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x146
00000000`0104f1b0 000007fe`ff59532d : 00000000`00369a00 00000000`0125e8d0 000007fe`ff570000 00000000`0034e920 : RPCRT4!LRPC_SCALL::DispatchRequest+0x149
00000000`0104f290 000007fe`ff5b2e7f : 00000000`00000000 00000000`0034a820 00000000`00000000 00000000`00000001 : RPCRT4!LRPC_SCALL::HandleRequest+0x20d
00000000`0104f3c0 000007fe`ff5b2a35 : 00000000`ffb57f48 00000000`00000000 00000000`0034a920 00000000`00000000 : RPCRT4!LRPC_ADDRESS::ProcessIO+0x3bf
00000000`0104f500 00000000`7796b68b : 00000000`00000048 00000000`00000001 00000000`00000000 00000000`00000000 : RPCRT4!LrpcIoComplete+0xa5
00000000`0104f590 00000000`7796feff : 00000000`00000000 00000000`00000000 00000000`0000ffff 00000000`00000000 : ntdll!TppAlpcpExecuteCallback+0x26b
00000000`0104f620 00000000`7774652d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!TppWorkerThread+0x3f8
00000000`0104f920 00000000`7797c521 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`0104f950 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

FAILURE_BUCKET_ID:  X64_0xc4_62_VRF_LEAKED_POOL_IMAGE_test.sys

BUCKET_ID:  X64_0xc4_62_VRF_LEAKED_POOL_IMAGE_test.sys

Followup: MachineOwner
---------

The output above says:
Arg2: fffffa80059c5c88, name of the driver having the issue.
Enter: db fffffa80059c5c88
0: kd> db fffffa80059c5c88
fffffa80`059c5c88  74 00 65 00 73 00 74 00-2e 00 73 00 79 00 73 00  t.e.s.t...s.y.s.
fffffa80`059c5c98  00 00 00 00 00 00 00 00-0e 00 36 02 43 63 56 70  ..........6.CcVp
fffffa80`059c5ca8  60 5b 9c 05 80 fa ff ff-a0 b2 cb 03 80 fa ff ff  `[..............
fffffa80`059c5cb8  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
fffffa80`059c5cc8  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
fffffa80`059c5cd8  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
fffffa80`059c5ce8  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
fffffa80`059c5cf8  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
Sure enough, it is the driver name, stored as a UTF-16 string.
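
The dump shows every character followed by a 00 byte because the name is stored as UTF-16LE. A small user-mode sketch that turns such bytes back into readable text, assuming the name contains only ASCII characters; utf16le_ascii_to_text and dump_bytes are hypothetical names, and the byte values are copied from the db output above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert UTF-16LE bytes holding ASCII-only text into a
   C string. Stops at a 16-bit zero, a non-ASCII unit, or a full buffer.
   Returns the number of characters written, not counting the terminator. */
size_t utf16le_ascii_to_text(const unsigned char *bytes, size_t byte_count,
                             char *out, size_t out_size)
{
    size_t i, n = 0;
    assert(bytes != NULL && out != NULL && out_size > 0);
    for (i = 0; i + 1 < byte_count && n + 1 < out_size; i += 2) {
        unsigned int unit = bytes[i] | ((unsigned int)bytes[i + 1] << 8);
        if (unit == 0 || unit > 0x7F) {
            break;
        }
        out[n++] = (char)unit;
    }
    out[n] = '\0';
    return n;
}

/* First row of the db output, plus the two zero bytes that follow it. */
const unsigned char dump_bytes[] = {
    0x74, 0x00, 0x65, 0x00, 0x73, 0x00, 0x74, 0x00,
    0x2e, 0x00, 0x73, 0x00, 0x79, 0x00, 0x73, 0x00,
    0x00, 0x00
};
```

Running the helper over dump_bytes recovers the eight characters of the module name.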

Next: Arg3: fffffa8004f5a010, verifier internal structure with driver information.
Presumably this is a structure; I have not yet looked into which one.
The WinDbg help, however, describes this parameter only as: Reserved.

Next: Arg4: 0000000000000001, total # of (paged+nonpaged) allocations that weren't freed.
This is the number of allocations that were made but never freed (paged and nonpaged together), not a size in bytes.
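
When sorting through many crash logs it can help to pick these values out of the summary line mechanically. A minimal user-mode sketch, assuming the line has exactly the "BugCheck C4, {62, ..., 1}" shape shown above; struct bugcheck, parse_bugcheck_line and is_leak_on_unload are hypothetical names, not part of any debugger API:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the summary line WinDbg prints. */
struct bugcheck {
    unsigned long long code;
    unsigned long long arg[4];
};

/* Parse a line shaped like "BugCheck C4, {62, a, b, 1}" (all values hex).
   Returns 1 on success, 0 if the line does not match. */
int parse_bugcheck_line(const char *line, struct bugcheck *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "BugCheck %llx, {%llx, %llx, %llx, %llx}",
                  &out->code, &out->arg[0], &out->arg[1],
                  &out->arg[2], &out->arg[3]) == 5;
}

/* True for the case discussed here: Driver Verifier (0xC4) reporting pool
   that was still allocated when the driver unloaded (first argument 0x62). */
int is_leak_on_unload(const struct bugcheck *bc)
{
    assert(bc != NULL);
    return bc->code == 0xC4 && bc->arg[0] == 0x62;
}
```

For this kind of bugcheck, arg[3] is then the count of leaked allocations described above.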

Next, simply do what the hint says:
Type !verifier 3 drivername.sys for info on the allocations
  that were leaked that caused the bugcheck.
0: kd> !verifier 3 test.sys

Verify Level fbf ... enabled options are:
  Special pool
  Special irql
  Inject random low-resource API failures
  All pool allocations checked on unload
  Io subsystem checking enabled
  Deadlock detection enabled
  DMA checking enabled
  Security checks enabled
  Force pending I/O requests
  IRP Logging
  Miscellaneous checks enabled

Summary of All Verifier Statistics

RaiseIrqls                             0x0
AcquireSpinLocks                       0x0
Synch Executions                       0x0
Trims                                  0x24d9

Pool Allocations Attempted             0x2757b
Pool Allocations Succeeded             0x2757b
Pool Allocations Succeeded SpecialPool 0x2757b
Pool Allocations With NO TAG           0x0
Pool Allocations Failed                0x0
Resource Allocations Failed Deliberately   0x0

Current paged pool allocations         0x0 for 00000000 bytes
Peak paged pool allocations            0x0 for 00000000 bytes
Current nonpaged pool allocations      0x1 for 0000000C bytes
Peak nonpaged pool allocations         0x1 for 0000000C bytes

Driver Verification List

Entry     State           NonPagedPool   PagedPool   Module

fffffa8003c2f740 Loaded           0000000c       00000000    test.sys

Current Pool Allocations  00000001    00000000
Current Pool Bytes        0000000c    00000000
Peak Pool Allocations     00000001    00000000
Peak Pool Bytes           0000000c    00000000

PoolAddress  SizeInBytes    Tag       CallersAddress
fffff9800933eff0     0x0000000c     test      fffff88004c73043


Here we can see:
Peak Pool Allocations     00000001    00000000
Peak Pool Bytes           0000000c    00000000

PoolAddress  SizeInBytes    Tag       CallersAddress
fffff9800933eff0     0x0000000c     test      fffff88004c73043

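A note on pool tags:
Some tags that this driver never defined can still show up under this driver. Those tags are defined by the system, and they appear because a system routine was used incorrectly, typically because a buffer that the routine allocated on the driver's behalf was never released.
Examples: RtlConvertSidToUnicodeString, RtlAnsiStringToUnicodeString, RtlUnicodeStringToAnsiString, and so on.
I have forgotten what the tags for these allocations are, though.

This article only demonstrates a nonpaged pool leak; it does not demonstrate a paged pool leak.
One allocation was not freed, and the reported size of that allocation is 12 (0xc) bytes.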

Take a look at it:
0: kd> db fffff9800933eff0
fffff980`0933eff0  d7 d7 d7 d7 d7 d7 d7 d7-d7 d7 d7 d7 d7 d7 d7 d7  ................
fffff980`0933f000  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
fffff980`0933f010  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
fffff980`0933f020  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
fffff980`0933f030  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
fffff980`0933f040  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
fffff980`0933f050  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
fffff980`0933f060  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
It looks like 16 bytes.
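
The 16 readable bytes followed by unreadable memory match how special pool (enabled in the verifier options above) lays out an allocation: it is pushed to the end of a page, and the next page is left inaccessible. A rough model of the arithmetic, assuming 4 KB pages and an alignment of 8 or 16 bytes (both give the same result for this allocation; the exact alignment rule is an assumption here). end_of_page_address is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 0x1000u /* 4 KB pages on x86/x64 */

/* Hypothetical model: the highest address in the page that is a multiple of
   `alignment` (a power of two) and still leaves room for `size` bytes
   before the end of the page. */
uint64_t end_of_page_address(uint64_t page_base, uint64_t size, uint64_t alignment)
{
    uint64_t page_end = page_base + PAGE_SIZE_BYTES;
    assert(size > 0 && size <= PAGE_SIZE_BYTES);
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (page_end - size) & ~(alignment - 1);
}
```

With the page that starts at fffff9800933e000 and the reported size 0xc, this yields fffff9800933eff0, the PoolAddress in the listing, which leaves 16 bytes before the page boundary.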

0: kd> u fffff88004c73043
test!DriverEntry+0x33 [e:\driver\test\test.c @ 20]:
fffff880`04c73043 4889442428      mov     qword ptr [rsp+28h],rax
fffff880`04c73048 48837c242800    cmp     qword ptr [rsp+28h],0
fffff880`04c7304e 7508            jne     test!DriverEntry+0x48 (fffff880`04c73058)
fffff880`04c73050 c7442420010000c0 mov     dword ptr [rsp+20h],0C0000001h
fffff880`04c73058 488b442440      mov     rax,qword ptr [rsp+40h]
fffff880`04c7305d 488d0d1c000000  lea     rcx,[test!Unload (fffff880`04c73080)]
fffff880`04c73064 48894868        mov     qword ptr [rax+68h],rcx
fffff880`04c73068 8b442420        mov     eax,dword ptr [rsp+20h]

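Note that this step requires the symbol files to be loaded correctly.
There it is!
The memory allocated at line 20 of e:\driver\test\test.c is the memory that was never freed.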
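
If symbols cannot be loaded, the CallersAddress can still be reduced to an offset inside the image and looked up in a linker map or a disassembler. A minimal sketch of that subtraction, assuming the image base is the value from the FAULTING_MODULE line of !analyze -v; rva_from_address is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Offset of an address from the start of the image that contains it. */
uint64_t rva_from_address(uint64_t address, uint64_t image_base)
{
    assert(address >= image_base);
    return address - image_base;
}
```

For this crash, fffff88004c73043 minus the base fffff88004c72000 gives offset 0x1043 into test.sys.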

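That completes the analysis. I will not spell out the fix; I trust you know it.

Summary and lessons learned:
Solving a problem takes an approach and a method. These come first from thinking it through yourself (I had no one to ask), and second from looking things up and searching.
Without an approach and a method you run around like a headless fly, and however long you spend you get no results and make no progress.
Once you know how, it is actually quite simple.

Memory that stays in the system after unloading a driver for which you have no source is a separate topic.
For kernel memory leaks that happen while the driver is running, see: http://bbs.pediy.com/showthread.php?t=154015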
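
For readers who want the fix spelled out anyway: keep the pointer somewhere the unload routine can reach, and free it there (in the real driver, with ExFreePoolWithTag and the same tag). The sketch below is a user-mode model only. model_alloc and model_free are hypothetical stand-ins built on malloc/free, and the counter imitates, very roughly, the count that pool tracking checks at unload:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_TAG 0x74736574u /* same value as 'tset' in the sample driver */

/* Hypothetical stand-ins for ExAllocatePoolWithTag / ExFreePoolWithTag.
   They only count outstanding allocations; the tag is ignored. */
unsigned long g_outstanding = 0;

void *model_alloc(size_t bytes, unsigned long tag)
{
    void *p = malloc(bytes);
    (void)tag;
    if (p != NULL) {
        g_outstanding++;
    }
    return p;
}

void model_free(void *p, unsigned long tag)
{
    (void)tag;
    if (p != NULL) {
        free(p);
        g_outstanding--;
    }
}

/* The pointer lives in a global so the unload routine can find it. */
void *g_buffer = NULL;

/* Models DriverEntry: allocate once, report failure if that fails. */
int model_driver_entry(void)
{
    g_buffer = model_alloc(9, MODEL_TAG);
    return g_buffer != NULL ? 0 : -1;
}

/* Models the corrected Unload: release what DriverEntry allocated. */
void model_unload(void)
{
    model_free(g_buffer, MODEL_TAG);
    g_buffer = NULL;
}

/* What the unload-time check would count as leaked. */
unsigned long model_leaked_allocations(void)
{
    return g_outstanding;
}
```

After model_driver_entry the count is 1, and after model_unload it drops back to 0, which is the state Driver Verifier expects to find when the real driver unloads.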


made by correy
made at 2013.07.13
email:kouleguan at hotmail dot com
homepage:http://correy.webs.com

Written in a hurry; if anything here is wrong, please point it out.


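Addendum, 2013.11.19:
Authoritative reference:
http://msdn.microsoft.com/en-us/library/windows/hardware/dn457995(v=vs.85).aspx
It comes down to two more commands:
On the caller's address, run: ln 0xXXXXXXXXXXX
On the address of the allocated memory, run: !verifier 0x80 0xXXXXXXXXXXX
Advice: if you do not know how to use the !verifier 0x80 command, do not wrap ExAllocatePoolWithTag in a function of your own, because CallersAddress will then always point into the wrapper.
        Once you can use the !verifier 0x80 command, wrapping ExAllocatePoolWithTag is fine.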
