Stop error in WFP on Server 2008 beta3

I was just testing my code on the Server 2008 beta3, and got a stop error. I attached a kernel debugger and got the following details, but I do not see my kernel code in the call stack. Can you tell if this is due to something I am doing wrong in my code, or is this a bug in beta3?

Thanks,

Jeremy

Code Snippet

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000004, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 8203083f, address which referenced memory

Debugging Details:


READ_ADDRESS: 00000004

CURRENT_IRQL: 2

FAULTING_IP:
NETIO!WfpRefreshEntryLru+12
8203083f 3b4704 cmp eax,dword ptr [edi+4]

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0xD1

PROCESS_NAME: svchost.exe

TRAP_FRAME: 8dc9dd30 -- (.trap 0xffffffff8dc9dd30)
ErrCode = 00000000
eax=00000000 ebx=88966ec0 ecx=00000009 edx=88966f01 esi=88966f64 edi=00000000
eip=8203083f esp=8dc9dda4 ebp=8dc9ddb8 iopl=0 nv up ei ng nz ac pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010296
NETIO!WfpRefreshEntryLru+0x12:
8203083f 3b4704 cmp eax,dword ptr [edi+4] ds:0023:00000004=?
Resetting default scope

LAST_CONTROL_TRANSFER: from 818aa26f to 818340a8

STACK_TEXT:
8dc9d914 818aa26f 00000003 fc2318a4 00000000 nt!RtlpBreakWithStatusInstruction
8dc9d964 818aace8 00000003 00000004 8203083f nt!KiBugCheckDebugBreak+0x1c
8dc9dd10 81847ec8 0000000a 00000004 00000002 nt!KeBugCheck2+0x600
8dc9dd10 8203083f 0000000a 00000004 00000002 nt!KiTrap0E+0x2ac
8dc9ddb8 8236ff2c 88966f64 00000000 00000000 NETIO!WfpRefreshEntryLru+0x12
8dc9df2c 82364f0a 8e217d90 8e210002 88966ec0 tcpip!WfpAleInsertRemoteEndpoint+0x25e
8dc9e1b8 82365525 8e217d90 00000030 823cf2e8 tcpip!WfpAlepAuthorizeSend+0x603
8dc9e304 8237e9e1 00000030 8e210002 00000011 tcpip!WfpAleAuthorizeSend+0x1c8
8dc9e358 8237e176 8e217d90 8e210002 00000011 tcpip!WfpAleConnectAcceptIndicate+0x4c
8dc9e3c8 8236b1db 8dc9e590 00000011 8e210002 tcpip!ProcessALEForTransportPacket+0xf2
8dc9e448 82383f8c 8dc9e590 00000011 8e210002 tcpip!ProcessAleForNonTcpOut+0x5b
8dc9e584 82384267 00000011 8e217d90 000017c0 tcpip!WfpProcessOutTransportStackIndication+0x1d8
8dc9e5f8 82384aba 00000000 8459d0f8 8dc9e758 tcpip!IppInspectLocalDatagramsOut+0xb2
8dc9e798 82387027 00000000 00000004 823d6bc0 tcpip!IppSendDatagramsCommon+0x51b
8dc9e7b8 8236c36d 84586830 8dc9e8b4 8e217ef0 tcpip!IpNlpSendDatagrams+0x4b
8dc9ea00 88fb86e8 8895e578 8458a938 8e220940 tcpip!UdpSendMessages+0xb9f
8dc9ea18 88fd5d7c 8e217ef0 8dc9ea30 8e217ef0 afd!AfdTLSendMessages+0x27
8dc9ea68 88fd5a7a 8e217ef0 8e220940 05357cc9 afd!AfdTLFastDgramSend+0x7d
8dc9eae4 88fcb3d0 8e217ef0 8dc9eb7c 00000023 afd!AfdFastDatagramSend+0x5ac
8dc9ec58 819cae0b 8e21be01 00000001 0307ed88 afd!AfdFastIoDeviceControl+0x3c1
8dc9ed00 81a1fc70 88897300 0000061c 00000000 nt!IopXxxControlFile+0x2cf
8dc9ed34 81844bba 00000620 0000061c 00000000 nt!NtDeviceIoControlFile+0x2a
8dc9ed34 76e4d144 00000620 0000061c 00000000 nt!KiFastCallEntry+0x12a
0307ed3c 76e4bb08 74c75514 00000620 0000061c ntdll!KiFastSystemCallRet
0307ed40 74c75514 00000620 0000061c 00000000 ntdll!ZwDeviceIoControlFile+0xc
0307ee40 76f13e5e 00000620 0307ee80 00000001 mswsock!WSPSendTo+0x254
0307ee90 751599f9 00000620 032d1d1c 00000023 WS2_32!sendto+0x8c
0307eec0 751598f0 00000000 00000023 00000000 DNSAPI!Send_MessagePrivateEx+0x194
0307eee4 75159869 032d1c48 032d1698 032d16c8 DNSAPI!sendUsingServerInfo+0x37
0307ef20 75158e5f 032d1c48 00000000 00000001 DNSAPI!sendUdpToNextDnsServers+0x131
0307f018 75158bf4 00000000 032d1f28 06006000 DNSAPI!Send_AndRecvUdpWithParam+0x197
0307f0c4 75159227 032d1698 0000232b 0307f438 DNSAPI!Send_AndRecv+0x95
0307f14c 751590e3 0307f438 000025e5 0307f438 DNSAPI!Query_Wire+0xed
0307f164 75157609 06006000 00000000 0307f438 DNSAPI!Query_SingleNamePrivate+0x83
0307f3ac 75157886 0307f438 0307f3dc 75156ec4 DNSAPI!Query_SingleNameDualAddr+0x86
0307f3b8 75156ec4 0307f438 00001000 0307f438 DNSAPI!Query_SingleName+0x16
0307f3dc 75156df8 00000001 032d1698 0307f438 DNSAPI!Query_AllNames+0xab
0307f400 73c331fd 00000100 06006000 001ef634 DNSAPI!Query_Main+0x7b
0307f418 73c3331d 0307f438 0307f894 0307fabc dnsrslvr!ResolverQuery+0x3b
0307f890 75f25cc7 00000000 001ef634 0000001c dnsrslvr!R_ResolverQuery+0xf6
0307f8b8 75fa130f 73c3322b 0307fac0 00000005 RPCRT4!Invoke+0x2a
0307fce4 75fa0482 00000000 00000000 00218f40 RPCRT4!NdrStubCall2+0x27b
0307fd00 75f2630e 00218f40 ac8909fe 00218f40 RPCRT4!NdrServerCall2+0x1e
0307fd3c 75f261bc 73c3184e 00218f40 0307fde0 RPCRT4!DispatchToStubInCNoAvrf+0x41
0307fdb0 75f26652 00000000 00000000 00000007 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0xdf
0307fdf0 75f2651d 00218ed0 0307fe2c 00218ed0 RPCRT4!LRPC_SCALL::DispatchRequest+0xa2
0307fe5c 75f251bb 00218ed0 001ef5c8 001ee4c0 RPCRT4!LRPC_SCALL::HandleRequest+0x1d2
0307ff04 75f38264 00000000 001727c0 001cfbe8 RPCRT4!LRPC_ADDRESS::ProcessIO+0x214
0307ff58 75f37e55 0014d8f8 0307ff88 75f37e1b RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0x162
0307ff64 75f37e1b 0014d8f8 00000000 00000000 RPCRT4!ProcessIOEventsWrapper+0xe
0307ff88 75f37e81 001cfbe8 0307ffa0 76c23695 RPCRT4!BaseCachedThreadRoutine+0x5c
0307ff94 76c23695 001727c0 0307ffe0 76e2a3c4 RPCRT4!ThreadStartRoutine+0x1e
0307ffa0 76e2a3c4 001727c0 75ec2145 00000000 kernel32!BaseThreadInitThunk+0xe
0307ffe0 00000000 75f37e63 001727c0 00000000 ntdll!_RtlUserThreadStart+0x23


STACK_COMMAND: kb

FOLLOWUP_IP:
NETIO!WfpRefreshEntryLru+12
8203083f 3b4704 cmp eax,dword ptr [edi+4]

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: NETIO!WfpRefreshEntryLru+12

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: NETIO

IMAGE_NAME: NETIO.SYS

DEBUG_FLR_IMAGE_TIMESTAMP: 462595db

FAILURE_BUCKET_ID: 0xD1_NETIO!WfpRefreshEntryLru+12

BUCKET_ID: 0xD1_NETIO!WfpRefreshEntryLru+12

Followup: MachineOwner

[6788 byte] By [JeremyDrake] at [2008-1-10]
# 1

Are you seeing this consistently?

What does the driver do during ALE_AUTH_CONNECT for UDP traffic?

Thanks,

Biao.W.

BiaoWang[MSFT] at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...
# 2

Yes, consistently.

The simple answer to what the callout does during ALE_AUTH_CONNECT_V4 is that it can either call FwpsPendOperation0, in which case it does:

Code Snippet

classifyOut->actionType = FWP_ACTION_BLOCK;

classifyOut->rights &= ~FWPS_RIGHT_ACTION_WRITE;

classifyOut->flags |= FWPS_CLASSIFY_OUT_FLAG_ABSORB;

Otherwise it can return a normal FWP_ACTION_BLOCK (clearing the RIGHT_ACTION_WRITE, but not setting the ABSORB flag), or FWP_ACTION_CONTINUE (doing none of the above).

In the last instance of the callout invoked, it seems to be doing the first option: pending the operation.

However, in reality things are much more complex. This callout actually exists twice on the ALE_AUTH_CONNECT layer, with different weights and different provider data. The callout uses the provider data to decide which usermode component to ask what it should do, and there are two components which need to have an opportunity to decide at different points in the processing of the request. Each instance has its own injection handle.

So, what generally happens is that the first one pends, clones, and blocks the packet, the usermode app gives the OK, so we call FwpsContinueOperation0 and then reinject the NBL if the NBL is not null (it is null for TCP).

Now, the second instance gets called during FwpsContinueOperation0, and when it does, the NBL indicated to it is NULL. It then calls FwpsPendOperation0, sees that the NBL is null so does not clone it, and then does the block/absorb trick. At this point, FwpsContinueOperation0 from the first instance returns, and we inject the cloned net buffer data.

This whole sequence of events seems a little off to me. Do you think this would work better if instead of different provider data, I assigned a new guid and made a new FWP[SM]_CALLOUT0 which calls the same functions?

Thanks for your help,
Jeremy

JeremyDrake at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...
# 3

Jeremy,

Can you please send a email to wfp@microsoft.com with your email such that we can arrange to download the crash dump from you?

thanks,

Biao.W.

BiaoWang[MSFT] at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...
# 4

I am attempting to get approval for this. Just so you don't think I've forgotten about it...

JeremyDrake at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...
# 5

A code restructuring in the way I did DNS lookups has made this crash happen much less often, but it does still occur, and I have confirmed that it does still happen in RC0.

I still have not heard back on my request to send you a crash dump, I will try again to get authorization to do this, as I think it would be obvious that it is in our best interest as well as yours that this bug be fixed before SP1 is released.

Assuming I do get approval, what format of crash dump do you want? I have a minidump and a kernel mode memory dump (the default format of the c:\windows\memory.dmp file) of the issue already sitting here for both beta3 and rc0.

JeremyDrake at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...
# 6
Jeremy Drake wrote:

I am attempting to get approval for this. Just so you don't think I've forgotten about it...

I got my approval, and sent an email to wfp@microsoft.com.

JeremyDrake at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...
# 7

We would need c:\windows\memory.dmp.

thanks,

Biao.W.

BiaoWang[MSFT] at 2007-10-3 > top of Msdn Tech,Windows Networking Development,Windows Filtering Platform (WFP)...