Please Help -BSOD!!

Status
Not open for further replies.
Hi Guys,

I need some urgent help. I have a ProLiant ML570 G2 running Windows Server 2003 Standard Edition that is blue screening very often; today it has BSODed 3 times. I tried debugging the dump and it points to a driver, intelppm.sys. I have no clue about this driver. I searched Google and got some hits for Virtual PCs (this server is not a virtual server). No one is talking about a 2003 server blue screening because of intelppm.sys.

I am pasting the dump analysis here. Please help me out.

7: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

MACHINE_CHECK_EXCEPTION (9c)
A fatal Machine Check Exception has occurred.
KeBugCheckEx parameters;
x86 Processors
If the processor has ONLY MCE feature available (For example Intel
Pentium), the parameters are:
1 - Low 32 bits of P5_MC_TYPE MSR
2 - Address of MCA_EXCEPTION structure
3 - High 32 bits of P5_MC_ADDR MSR
4 - Low 32 bits of P5_MC_ADDR MSR
If the processor also has MCA feature available (For example Intel
Pentium Pro), the parameters are:
1 - Bank number
2 - Address of MCA_EXCEPTION structure
3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
4 - Low 32 bits of MCi_STATUS MSR for the MCA bank that had the error
IA64 Processors
1 - Bugcheck Type
1 - MCA_ASSERT
2 - MCA_GET_STATEINFO
SAL returned an error for SAL_GET_STATEINFO while processing MCA.
3 - MCA_CLEAR_STATEINFO
SAL returned an error for SAL_CLEAR_STATEINFO while processing MCA.
4 - MCA_FATAL
FW reported a fatal MCA.
5 - MCA_NONFATAL
SAL reported a recoverable MCA and we don't support currently
support recovery or SAL generated an MCA and then couldn't
produce an error record.
0xB - INIT_ASSERT
0xC - INIT_GET_STATEINFO
SAL returned an error for SAL_GET_STATEINFO while processing INIT event.
0xD - INIT_CLEAR_STATEINFO
SAL returned an error for SAL_CLEAR_STATEINFO while processing INIT event.
0xE - INIT_FATAL
Not used.
2 - Address of log
3 - Size of log
4 - Error code in the case of x_GET_STATEINFO or x_CLEAR_STATEINFO
AMD64 Processors
1 - Bank number
2 - Address of MCA_EXCEPTION structure
3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
4 - Low 32 bits of MCi_STATUS MSR for the MCA bank that had the error
Arguments:
Arg1: 00000000
Arg2: f775d2b0
Arg3: b2000000
Arg4: 1020080f

Debugging Details:
------------------

NOTE: This is a hardware error. This error was reported by the CPU
via Interrupt 18. This analysis will provide more information about
the specific error. Please contact the manufacturer for additional
information about this error and troubleshooting assistance.

This error is documented in the following publication:

- IA-32 Intel(r) Architecture Software Developer's Manual
Volume 3: System Programming Guide

Bit Mask:

    MA                           Model Specific       MCA
 O  ID     Other Information       Error Code      Error Code
VV  SDP ___________|____________ _______|_______ _______|______
AEUECRC|                        |               |              |
LRCNVVC|                        |               |              |
^^^^^^^|                        |               |              |
   6         5         4         3         2         1
3210987654321098765432109876543210987654321098765432109876543210
----------------------------------------------------------------
1011001000000000000000000000000000010000001000000000100000001111


VAL - MCi_STATUS register is valid
Indicates that the information contained within the IA32_MCi_STATUS
register is valid. When this flag is set, the processor follows the
rules given for the OVER flag in the IA32_MCi_STATUS register when
overwriting previously valid entries. The processor sets the VAL
flag and software is responsible for clearing it.

UC - Error Uncorrected
Indicates that the processor did not or was not able to correct the
error condition. When clear, this flag indicates that the processor
was able to correct the error condition.

EN - Error Enabled
Indicates that the error was enabled by the associated EEj bit of the
IA32_MCi_CTL register.

PCC - Processor Context Corrupt
Indicates that the state of the processor might have been corrupted
by the error condition detected and that reliable restarting of the
processor may not be possible.

BUSCONNERR - Bus and Interconnect Error BUS{LL}_{PP}_{RRRR}_{II}_{T}_err
These errors match the format 0000 1PPT RRRR IILL



Concatenated Error Code:
--------------------------
_VAL_UC_EN_PCC_BUSCONNERR_F

This error code can be reported back to the manufacturer.
They may be able to provide additional information based upon
this error. All questions regarding STOP 0x9C should be
directed to the hardware manufacturer.

BUGCHECK_STR: 0x9C_IA32_GenuineIntel

DEFAULT_BUCKET_ID: DRIVER_FAULT

PROCESS_NAME: Idle

CURRENT_IRQL: 2

LAST_CONTROL_TRANSFER: from 80a5cbd8 to 80827451

STACK_TEXT:
f7775280 80a5cbd8 0000009c 00000000 f77752b0 nt!KeBugCheckEx+0x1b
f77753b4 80a5486f f776ffe0 00000000 00000000 hal!HalpMcaExceptionHandler+0x11e
f77753b4 bae2aca2 f776ffe0 00000000 00000000 hal!HalpMcaExceptionHandlerWrapper+0x77
f78e2d50 8088d262 00000000 0000000e 00000000 intelppm!AcpiC1Idle+0x12
f78e2d54 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0xa


STACK_COMMAND: kb

FOLLOWUP_IP:
intelppm!AcpiC1Idle+12
bae2aca2 6a00 push 0

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: intelppm!AcpiC1Idle+12

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: intelppm

IMAGE_NAME: intelppm.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 42435b38

FAILURE_BUCKET_ID: 0x9C_IA32_GenuineIntel_intelppm!AcpiC1Idle+12

BUCKET_ID: 0x9C_IA32_GenuineIntel_intelppm!AcpiC1Idle+12

Followup: MachineOwner
---------
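
For anyone who wants to check the bit mask above by hand: Arg3 and Arg4 are the high and low halves of the MCi_STATUS register, so the flag names in the concatenated error code can be reproduced from them. Below is a minimal Python sketch, not anything the debugger itself runs. It assumes the flag bit positions shown in the bit mask (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57) and the bus error layout 0000 1PPT RRRR IILL quoted in the analysis. The function name and dictionary keys are made up for illustration.

```python
# Flag bit positions in IA32_MCi_STATUS, taken from the bit mask in the
# analysis output (assumption: they match the Intel manual it cites).
FLAG_BITS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
    "MISCV": 59, "ADDRV": 58, "PCC": 57,
}


def decode_mci_status(high32, low32):
    """Split a 64-bit MCi_STATUS value (given as two 32-bit halves) into fields."""
    status = (high32 << 32) | low32

    # Collect the names of every flag whose bit is set.
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]

    mca_code = status & 0xFFFF            # bits 15:0, MCA error code
    model_code = (status >> 16) & 0xFFFF  # bits 31:16, model-specific error code

    result = {"flags": flags, "mca_code": mca_code, "model_code": model_code}

    # Bus and interconnect errors match 0000 1PPT RRRR IILL,
    # i.e. the top five bits of the MCA error code are 00001.
    if (mca_code & 0xF800) == 0x0800:
        result["bus_error"] = {
            "PP": (mca_code >> 9) & 0x3,    # participation
            "T": (mca_code >> 8) & 0x1,     # timeout
            "RRRR": (mca_code >> 4) & 0xF,  # request type
            "II": (mca_code >> 2) & 0x3,    # memory or I/O
            "LL": mca_code & 0x3,           # cache level
        }
    return result


if __name__ == "__main__":
    # Arg3 and Arg4 from the bugcheck above.
    print(decode_mci_status(0xB2000000, 0x1020080F))
```

With Arg3 = b2000000 and Arg4 = 1020080f this gives the flags VAL, UC, EN and PCC, an MCA error code of 0x080f (which matches the bus and interconnect pattern) and a model-specific code of 0x1020, in line with _VAL_UC_EN_PCC_BUSCONNERR in the analysis.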

Also, when the server comes up after the bug check, it posts errors related to Fibre Channel, as below:

Event Type: Error
Event Source: Storage Agents
Event Category: Events
Event ID: 1215
Date: 11/22/2006
Time: 11:26:18 AM
User: N/A
Computer: DCASAPP006
Description:
Fibre Channel Controller Status Change. The Fibre Channel Controller in Slot 6 has a new status of 6.
(Host controller status values: 1=other, 2=ok, 3=failed, 4=shutdown, 5=connectionDegraded, 6=connectionFailed)
[SNMP TRAP: 16028 in CPQFCA.MIB]



Please assist


Thanks in Advance

Jay.
 
Intelppm.sys is the CPU driver. I've had the same error on my system (AMD), and the solution was to uninstall the CPU driver.

The AMD CPU driver is only required if you use Cool 'n' Quiet (power management). If you uninstall it, Windows will use its own driver. I'm assuming that Intel is the same: you only need the driver for the power management feature (SpeedStep).
 
Hi Peter,

Many thanks for your suggestion. I would like to confirm that disabling this driver will not affect the server in any way other than the power management feature (SpeedStep).


Regards

Jay
 