Slideplayer Com Slide 10294885
VMware vSphere® 6.0 Knowledge Transfer Kit 624
Published by Baldwin Hall Modified over 7 years ago
4 ESXi
5 Components of ESXi
The ESXi architecture comprises the underlying operating system, called the VMkernel, and processes that run
on top of it
VMkernel provides a means for running all processes on the system, including management applications and
agents as well as virtual machines
It has control of all hardware devices on the server and manages resources for the applications
The main processes that run on top of VMkernel are
Direct Console User Interface (DCUI)
Virtual Machine Monitor (VMM)
VMware Agents (hostd, vpxa)
Common Information Model (CIM) System
13 ESXi Troubleshooting
Troubleshooting ESXi is much the same as troubleshooting any other operating system
Start by narrowing down the component which is causing the problem
Next review the logs as required to narrow down the issue
Common log files are as follows
/var/log/auth.log: ESXi Shell authentication success and failure
/var/log/esxupdate.log: ESXi patch and update installation logs
/var/log/hostd.log: Host management service logs, including virtual machine and host Task and Events,
communication with the vSphere Client and vCenter Server vpxa agent, and SDK connections
/var/log/syslog.log: Management service initialization, watchdogs, scheduled tasks and DCUI use
/var/log/vmkernel.log: Core VMkernel logs, including device discovery, storage and networking device and
driver events, and virtual machine startup
/var/log/vmkwarning.log: A summary of Warning and Alert log messages excerpted from the VMkernel logs
/var/log/vmksummary.log: A summary of ESXi host startup and shutdown, and an hourly heartbeat with
uptime, number of virtual machines running, and service resource consumption
/var/log/vpxa.log: vCenter Server vpxa agent logs, including communication with vCenter Server and the Host
Management hostd agent
/var/log/fdm.log: vSphere High Availability logs, produced by the FDM service
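As a rough aid for the "narrow down the component, then review the logs" approach, the list above can be kept as a small lookup table. This is only a sketch: the paths and descriptions come from the list, while the `logs_for` helper and its keyword matching are illustrative assumptions, not a VMware tool.

```python
# Log paths and descriptions are taken from the list above.
# The keyword lookup is a hypothetical helper for illustration only.
ESXI_LOGS = {
    "/var/log/auth.log": "ESXi Shell authentication success and failure",
    "/var/log/esxupdate.log": "ESXi patch and update installation",
    "/var/log/hostd.log": "host management service, virtual machine and host tasks and events, "
                          "communication with the vSphere Client and vCenter Server vpxa agent",
    "/var/log/syslog.log": "management service initialization, watchdogs, scheduled tasks, DCUI use",
    "/var/log/vmkernel.log": "core VMkernel logs: device discovery, storage and networking "
                             "device and driver events, virtual machine startup",
    "/var/log/vmkwarning.log": "warning and alert messages excerpted from the VMkernel logs",
    "/var/log/vmksummary.log": "host startup and shutdown, hourly heartbeat",
    "/var/log/vpxa.log": "vCenter Server vpxa agent, communication with vCenter Server and hostd",
    "/var/log/fdm.log": "vSphere High Availability logs from the FDM service",
}


def logs_for(keyword):
    """Return the log paths whose description mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [path for path, desc in ESXI_LOGS.items() if needle in desc.lower()]


print(logs_for("storage"))
print(logs_for("authentication"))
```

A keyword such as "storage" points to vmkernel.log, which matches the description in the list.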
15 Virtual Machines
16 Virtual Machine Troubleshooting
Virtual machines run as processes on the ESXi host
Troubleshooting is split into two categories
Inside the Guest OS – Standard OS troubleshooting should be used, including the OS-specific log files
ESXi host level troubleshooting – Concerning the virtual machine process, where the log file for the virtual
machine is reviewed for errors
By default, the ESXi host log files for a virtual machine are located in the directory in which the virtual
machine runs, and are named vmware.log
Generally, issues occur as a result of a problem in the guest OS
Host level crashes of the VM processes are relatively rare and are normally a result of hardware errors or
hardware incompatibility between hosts
18 vCenter Server
19 vCenter Server 6.0 with Embedded Platform Services Controller
[Diagram: Platform Services Controller and Management Node; component labels: SSO, CM, License, IS, Web, TOOLS]
Sufficient for most environments
Easiest to maintain and deploy
Recommended - 8 or fewer vCenter Servers
vCenter Server and the infrastructure controller are deployed on a single virtual machine or physical host.
vCenter Server with embedded infrastructure controller is suitable for smaller environments with eight or fewer
product instances.
To provide the common services, such as vCenter Single Sign-On, across multiple products and vCenter Server
instances, you can connect multiple vCenter Server instances with embedded infrastructure controllers
together.
You can do this by replicating the vCenter Single Sign-On data from one infrastructure controller to the
other infrastructure controllers. This way, infrastructure data for each product is replicated to all of the
infrastructure controllers, and each infrastructure controller contains a copy of the data for all of
the infrastructure controllers.
The embedded infrastructure controller supports either an internal database (vPostgres) or an external
database, such as Oracle or Microsoft SQL Server.
vCenter Server 6.0 with embedded infrastructure controller is available both for Windows and as a virtual
appliance.
Supports embedded and external database
Available for Windows and vCenter Server Appliance
27 vCenter Troubleshooting
vCenter Server for Windows has been consolidated and organized in this release
Installation and logging directories mimic those of the vCenter Server Appliance in previous releases
Start by narrowing down the component which is causing the problem
Next review the logs as required to narrow down the issue
Each process now has its own logging directory:
33 vSphere vMotion
34 vSphere vMotion and vSphere Storage vMotion Troubleshooting
vSphere vMotion and vSphere Storage vMotion are some of the best logged features in vSphere
Each migration has a unique Migration ID (MID) that can be used to search the logs for that vSphere
vMotion or vSphere Storage vMotion operation
MIDs look as follows:
Each time a vSphere vMotion or vSphere Storage vMotion migration is attempted, all logs can be reviewed to find the
error by using grep to search for the term Migrate
Both the source and the destination logs should be reviewed
The following is a list of common log files and errors
vmkernel.log – VMkernel logs usually contain storage or network errors (and possibly vSphere vMotion and
vSphere Storage vMotion timeouts)
hostd.log – contains interactions between vCenter and ESXi
vmware.log – virtual machine log file which will show issues with starting the virtual machine processes
vpxd.log – vSphere vMotion as seen from vCenter normally shows a timeout or other irrelevant data because
the errors are occurring on the host itself
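The search described above (look for the term Migrate, then follow one migration ID through both the source and the destination logs) can be sketched in a few lines. This is a minimal illustration, not a VMware utility: the function names are hypothetical, the migration ID is treated as an opaque string, and the sample lines and the `MID-EXAMPLE` value are made-up placeholders rather than real log output.

```python
def find_migration_lines(log_lines, migration_id=None):
    """Return lines that mention 'Migrate', optionally limited to one migration ID."""
    hits = [line for line in log_lines if "Migrate" in line]
    if migration_id is not None:
        hits = [line for line in hits if migration_id in line]
    return hits


def review_both_hosts(source_lines, destination_lines, migration_id=None):
    """Collect matching lines from both sides, since both logs should be reviewed."""
    return {
        "source": find_migration_lines(source_lines, migration_id),
        "destination": find_migration_lines(destination_lines, migration_id),
    }


# Placeholder lines for illustration only; real log lines look different.
source_log = [
    "Migrate: placeholder message for MID-EXAMPLE",
    "unrelated placeholder line",
    "Migrate: placeholder message for OTHER-ID",
]
destination_log = ["Migrate: placeholder message for MID-EXAMPLE"]

result = review_both_hosts(source_log, destination_log, "MID-EXAMPLE")
print(len(result["source"]), len(result["destination"]))
```

In practice the lines would be read from vmkernel.log, hostd.log and vmware.log on each host.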
39 Availability
vSphere High Availability
44 vSphere High Availability Technical Details – Master and Slave Summary Views
45 vSphere High Availability Technical Details – Master Election
A master is elected when the following conditions occur
vSphere High Availability is enabled
A master host fails
A management network partition occurs
The following algorithm is used for selecting the master
If a host has the greatest number of datastores, it is the best host
If there is a tie, then the host with the lexically highest moid is chosen. For example, moid "host-99" would be
higher than moid "host-100" because the comparison is character by character and 9 is greater than 1
After a master is elected and contacts vCenter, vCenter sends a compatibility list to the master which saves it
on its local disk, and then pushes it out to the slave hosts in the cluster
vCenter normally talks only to the master. It will sometimes talk to FDM agents on other hosts, especially if the
master states that it cannot reach a slave agent. vCenter will then try to contact that host to figure out why
Moid – Managed Object ID – vCenter identifier
There are some other scenarios when vCenter will talk to the other FDM agents
When scanning for master
When vCenter powers on a vSphere FT secondary VM
When host is reported isolated or partitioned
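The master selection rule on this slide (greatest number of datastores, ties broken by the lexically highest moid) can be sketched as below. This only illustrates the stated rule; the function name and the `(moid, datastore_count)` input shape are assumptions, and the real FDM election involves more than this comparison.

```python
def elect_master(hosts):
    """Pick a master from a list of (moid, datastore_count) tuples.

    The host with the most datastores wins; a tie goes to the moid that
    sorts highest as a string (a lexical, not numeric, comparison).
    """
    best = max(hosts, key=lambda host: (host[1], host[0]))
    return best[0]


# Tie on datastore count: "host-99" beats "host-100" because '9' > '1'.
print(elect_master([("host-100", 4), ("host-99", 4)]))

# No tie: the host with more datastores wins regardless of moid.
print(elect_master([("host-100", 5), ("host-99", 4)]))
```

Comparing the moid as a string, rather than parsing out the number, is what makes "host-99" rank above "host-100".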
46 vSphere High Availability Technical Details – Partitioning
Under normal operating conditions, there is only one master
However, if a management network failure occurs, a subset of the hosts might become isolated. This means
that they cannot communicate with the other hosts in the cluster over the management network
In such a situation, if the hosts can still ping the isolation response IP address but cannot reach the other hosts,
the cluster is called network partitioned
Each partition without an existing master will elect a new one
Thus, a partitioned cluster state will have multiple masters, one per partition
However, vCenter cannot report on more than one master, so you might see details for only one partition:
the one whose master vCenter finds first
When a network partition is corrected, one of the masters will take over from the others, thus reverting back to
a single master
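To make "one master per partition" concrete, the following sketch groups hosts by management-network reachability and elects a master in every partition that lost the existing one. It is a simplified model under stated assumptions: the link list, host names and function names are hypothetical, the election reuses the datastore-count and moid rule from the Master Election slide, and host isolation handling is not modeled.

```python
def find_partitions(hosts, links):
    """Group hosts into sets that can still reach each other.

    links is an iterable of (a, b) pairs that can communicate over the
    management network.
    """
    neighbours = {host: set() for host in hosts}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for host in hosts:
        if host in seen:
            continue
        group, stack = set(), [host]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


def masters_after_partition(datastores, links, current_master):
    """Return the sorted list of masters, one per partition.

    datastores maps moid -> datastore count. A partition that still contains
    the current master keeps it; every other partition elects a new one.
    """
    masters = []
    for group in find_partitions(list(datastores), links):
        if current_master in group:
            masters.append(current_master)
        else:
            masters.append(max(group, key=lambda moid: (datastores[moid], moid)))
    return sorted(masters)


datastores = {"host-1": 3, "host-2": 3, "host-3": 2, "host-4": 4}
split_links = [("host-1", "host-2"), ("host-3", "host-4")]
print(masters_after_partition(datastores, split_links, "host-1"))
```

With the link between the two halves restored, the same function returns a single master again.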
52 Availability
vSphere FT
53 vSphere FT Troubleshooting
vSphere FT has been completely rewritten in vSphere 6.0
Now, CPU compatibility is the same as vSphere vMotion compatibility because the same technology is used to
ship memory, CPU, storage, and network states across to the secondary virtual machine
When troubleshooting
Get logs for both primary and secondary VMs and hosts
Grab logs before log rotation
Ensure time is synchronized on all hosts
When reviewing the configuration, you should find both the primary and secondary VMX logs in the primary VM's
directory
They are named vmware.log and vmware-snd.log
Also, be sure to review vmkernel.log and hostd.log from both the primary and secondary hosts for errors
54 vSphere FT Troubleshooting – General Things To Look For (vmkernel, vmx)
T18:12:25.892Z cpu3:35660)FTCpt: 2401: ( pri) Primary init: nonce
T18:12:25.892Z cpu3:35660)FTCpt: 2440: ( pri) Setting allowedDiffCount = 64
T18:12:25.892Z cpu3:35660)FTCpt: 1217: Queued accept request for ftPairID
T18:12:25.892Z cpu3:35660)FTCpt: 2531: ( pri) vmx vmm 35662
T18:12:25.892Z cpu1:32805)FTCpt: 1262: ( pri) Waiting for connection
Generally, multiprocessor vSphere FT messages are prefixed with “FTCpt:” in the vmkernel and vmx logs
Like vSphere vMotion, each vSphere FT session has a unique vSphere FT ID, taken from the migration ID that
started it and shared by the vmx and vmkernel logs on both the primary and the secondary (it can be used to
verify that all logs are present)
The role of the VM is either “pri” or “snd”
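Because the messages share the “FTCpt:” prefix and carry the role in parentheses, they are easy to filter. The sketch below parses the sample lines shown on this slide; the regular expression is inferred from those five lines only, so treat it as an assumption about the format rather than a specification, and the function name is hypothetical.

```python
import re

# Pattern inferred from the sample lines on this slide (an assumption):
# "FTCpt: <number>: ( pri) message" where the role part may be absent.
FTCPT = re.compile(r"FTCpt:\s*(\d+):\s*(?:\(\s*(pri|snd)\s*\)\s*)?(.*)")


def parse_ftcpt(lines):
    """Return (role, message) pairs for FTCpt lines; role is None if not shown."""
    parsed = []
    for line in lines:
        match = FTCPT.search(line)
        if match:
            parsed.append((match.group(2), match.group(3).strip()))
    return parsed


sample = [
    "T18:12:25.892Z cpu3:35660)FTCpt: 2401: ( pri) Primary init: nonce",
    "T18:12:25.892Z cpu3:35660)FTCpt: 1217: Queued accept request for ftPairID",
    "T18:12:25.892Z cpu1:32805)FTCpt: 1262: ( pri) Waiting for connection",
]

for role, message in parse_ftcpt(sample):
    print(role, "|", message)
```

Filtering on the role makes it easier to line up the primary and secondary sides when comparing logs from both hosts.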
58 Availability
vSphere Distributed Resource Scheduler
59 DRS Troubleshooting
DRS uses a proprietary algorithm to assess and determine resource usage and to determine which hosts to
balance VMs to
DRS primarily uses vMotion to facilitate movements
Troubleshooting failures generally consists of figuring out why vMotion failed, rather than examining DRS itself,
because the algorithm just follows resource utilization
Ensure the following
vSphere vMotion is enabled and configured
The migration aggressiveness is set appropriately
The automation level is set to Fully automated if approvals are not needed for migrations
To test DRS, from the vSphere Web Client, select the Run DRS option, which will initiate recommendations
Failures can be assessed and corrected at that time
61 Content Library
62 Content Library Troubleshooting
The Content Library is easy to troubleshoot because there are two basic areas to examine
Creation/administration of Content Libraries
This area consists of issues with the Content Library creation, storage backing, creation of and synchronizing
Content Library items, and subscription problems.
Log files are cls-debug.log / cls-cis-debug.log
They are located in /var/log/vmware/vdcs/ OR C:/ProgramData/Vmware/CIS/logs/vdcs
Synchronization of Content Libraries
This area consists of issues where there are synchronization failures and problems with adding items to a
content library. You can also track transfer session ids between cls-debug and ts-debug.
Log files are ts-debug.log / ts-cis-debug.log
They are located in /var/log/vmware/vdcs/ OR C:/ProgramData/Vmware/CIS/logs/vdcs
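The two troubleshooting areas and their log files can be summarized as a lookup. This is a sketch only: the file names and directories are copied from the slide, while the `appliance` and `windows` keys, the area names and the `log_paths` helper are labels chosen for illustration.

```python
# File names and directories come from the slide; the dictionary keys are
# illustrative labels, not VMware terminology.
CONTENT_LIBRARY_LOGS = {
    "administration": ("cls-debug.log", "cls-cis-debug.log"),
    "synchronization": ("ts-debug.log", "ts-cis-debug.log"),
}

LOG_DIRECTORIES = {
    "appliance": "/var/log/vmware/vdcs/",
    "windows": "C:/ProgramData/Vmware/CIS/logs/vdcs",
}


def log_paths(area, platform):
    """Return the full paths of the log files to review for one problem area."""
    directory = LOG_DIRECTORIES[platform].rstrip("/")
    return [directory + "/" + name for name in CONTENT_LIBRARY_LOGS[area]]


print(log_paths("synchronization", "appliance"))
print(log_paths("administration", "windows"))
```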
76 Storage
77 Storage Troubleshooting
Troubleshooting storage is a broad topic that very much depends on the type of storage in use
Consult the vendor to determine what is normal and expected for storage
In general, the following are problems that are frequently seen
Overloaded storage
Slow storage
91 Networking
92 Networking Troubleshooting
Troubleshooting virtual networking is very similar to troubleshooting a physical network
Start by validating connectivity
Look at network statistics from esxtop as well as the physical switch
Is it a network performance problem?
Validate throughput
Is CPU load too high?
Are packets being dropped?
Is the issue limited to the virtual environment, or is it seen in the physical environment too?
One of the biggest issues that VMware has observed is dropped network packets (discussed next)
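When answering "Are packets being dropped?", it helps to express drops as a percentage of traffic so that ports with different loads can be compared. The arithmetic is sketched below; the counter values and the function name are hypothetical, and the inputs stand for whatever delivered and dropped counts you collect from esxtop or the physical switch.

```python
def drop_percentage(delivered, dropped):
    """Return dropped packets as a percentage of all packets handled."""
    total = delivered + dropped
    if total == 0:
        return 0.0  # no traffic, so nothing was dropped
    return 100.0 * dropped / total


# Hypothetical counters: 990 packets delivered, 10 dropped.
print(drop_percentage(990, 10))
```

Comparing the figure on the virtual side with the one on the physical switch helps show whether the issue is limited to the virtual environment.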
97 Questions
98 VMware vSphere 6.0 Knowledge Transfer Kit
VMware, Inc Hillview Ave Palo Alto, CA 94304