XRay: A Function Call Tracing System
Authors:
Alistair Veitch ([email protected])
Dean Berris ([email protected])
Eric Anderson ([email protected])
Nevin Heintze ([email protected])
Ning Wang ([email protected])
Contact: [email protected]
Date: 2016-04-05
Abstract
Rare events, such as the tail-latency spikes that matter at scale [1], are hard to diagnose. How do you find these "needle in a haystack" events while running in production? Time-based traces let developers track down these issues when standard approaches such as sampling, logging, and aggregate statistics fail. What kind of time-based trace data would be useful?
Ideally, we'd like to get non-sampled function call traces with high-precision timestamps from production servers that tell us which threads are running what code, and when.
However, instrumenting all functions is usually too expensive for production use. To deploy
in production, we require that:
● The cost is acceptable when tracing and barely measurable when not tracing.
● Instrumentation is automatic and directed towards functions that are important for
understanding the binary’s execution time.
● Tracing is efficient in both space and time -- only recording what is required and what
matters.
● Tracing is configurable with thresholds for storage (how much memory to use) and
accuracy (whether to log everything or only function calls taking at least some
amount of time).
● Tracing does not require changes to the operating system nor super-user privileges.
● Tracing can be turned on and off dynamically without having to restart the server.
[1] For more on why optimising tail latency matters, see "The Tail at Scale" by Jeff Dean and Luiz Barroso, Communications of the ACM, vol. 56 (2013), pp. 74-80.
Introducing XRay
XRay is a function call tracing system developed internally at Google for debugging performance issues in production servers. XRay is not specific to a class of applications and is applicable to any C/C++-based binary -- from storage servers handling many thousands of requests per second to command-line tools and unit tests. XRay is an integrated suite of tools that provides insight into how an application is performing, combining compiler infrastructure, a runtime library, and post-tracing analysis tools:
● Changes to the compiler to insert small "no-op" code sequences at function entry and exit points, which at runtime get patched to enable function-level tracing.
● A runtime library for logging function entry/exit events that dynamically prunes recorded events to maximize the 'value-per-MB' of the recorded traces.
● A tool that reconstructs timestamped function call trees from the XRay traces.
● Analysis tools that take the function call logs and highlight potential issues such as
lock contention.
This allows engineers to build and deploy a single XRay-instrumented binary to production that can be used both for standard deployment (tracing turned off, overhead minimal) and for debugging. When debugging live systems, the runtime library gathers trace data on demand, providing very detailed function call traces as well as input to analysis tools.
XRay is one part of a suite of tools for debugging systems at Google. It has been used successfully to identify latency issues in storage systems, web serving, and ads serving, as well as other services in production.
XRay relies on compiler changes to insert no-op sleds [2] at function entry and exit points, and to record those locations in tables encoded in the object files. At runtime, if XRay is disabled, these no-op sleds are executed as-is and add minimal execution overhead. However, if XRay is enabled, the XRay runtime library overwrites these no-ops with calls to instrumentation code that logs function entry/exit information to in-memory buffers. These logs also contain cycle-counter timestamps and enough metadata to reconstruct the program's operation in post-processing.

[2] These "no-op sleds" are a few bytes of code emitted by the compiler that do nothing.
XRay works in fully multithreaded programs. Turning tracing on patches instrumentation into the right sections of the program code at runtime; turning tracing off undoes these changes. XRay keeps the program semantics intact without forcing single-processor mode or stopping execution of the program while the instrumentation is being added and enabled.
XRay Overheads
To minimize the costs of the no-op sleds inserted at function entry and exit points, XRay employs heuristics to determine which functions to instrument. XRay will only instrument functions that the heuristics deem important for understanding the binary's execution time, or that are explicitly marked for instrumentation (see the attributes described below).
When tracing is enabled, we've measured an increase of between 20-40% in CPU usage, with a proportional increase in execution time. Because of this known overhead, we usually enable XRay tracing and collection for brief periods of time to minimise the effect on live running systems while still collecting useful data from services under load. Typically we collect a few seconds of data, which requires around 200MB of memory on a busy server.
We also typically only see up to a 2% increase in binary size with the XRay instrumentation
points.
Implementation Details
XRay has a few moving parts that work together to provide an accurate picture of what
functions are running in a server. We cover each of those moving parts in detail in the
following sections.
Compiler-inserted Instrumentation Points
At Google, we have patches on top of GCC that implement the XRay instrumentation point insertion heuristics. These patches yield code that, in x86 assembler, looks like:

local_block_sled_0:
  jmp . + 0x09
  (9 bytes worth of nops)
  ... # function prologue starts, followed by the body.
  ... # function epilogue starts, just before ret...
local_block_sled_1:
  retq
  (10 bytes worth of nops)
XRay also supports attributes that can be added to functions to ensure that they are always
instrumented if the XRay option is enabled in the compiler. These attributes take the form:
__attribute__((always_patch_for_instrumentation))
void Function() {
  ...
}
These attributes can be explicitly provided for functions that should always be instrumented (e.g. functions locking mutexes, for lock contention analysis) and are treated as special by the runtime library.
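For instance, a minimal sketch of marking a small lock wrapper so that contention analysis always sees it (the wrapper function here is hypothetical, and the attribute requires the XRay compiler patches):

#include <mutex>

// Hypothetical lock wrapper: small enough that the heuristics would
// likely skip it, so it is explicitly marked for instrumentation to
// support lock contention analysis.
__attribute__((always_patch_for_instrumentation))
void AcquireLock(std::mutex& mu) {
  mu.lock();
}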
Runtime Logging and Support Library
At runtime, XRay provides APIs to install the appropriate patch instructions into the instrumentation sleds. This API also includes a mechanism for installing a logging function called for every function entry and exit sled. The library loads the instrumentation map and patches the sleds as follows:
1. We go through each entry in the instrumentation map, loaded at runtime either from an external file or from the special sections in the binary, and do the following:
   a. For the function entry sleds, we replace the jmp and the 9 bytes of nops with the patched call sequence by writing the last 9 bytes (ending in the call) first, then atomically writing the first two bytes (the start of the mov) over the `jmp . + 0x09` instruction (sketched in the code below).
   b. For the function exit sleds, the process is similar, to preserve correct execution in multithreaded environments.
2. Once all the sleds have been patched, we atomically set a flag, checked at runtime by both the __xray_FunctionEntryStub and __xray_FunctionExitStub functions, to enable function call logging.
3. When tracing is explicitly turned off, we do one of the following:
   a. If asked to explicitly "unpatch" the code, we go through the instrumentation map and reverse the process done in 2, 1.a, and 1.b.
   b. If we only disable the logging but keep the instrumentation in place, we atomically set the flag checked by both __xray_FunctionEntryStub and __xray_FunctionExitStub to false. As an optimisation, we also change the first instruction in both __xray_FunctionEntryStub and __xray_FunctionExitStub to be an immediate `ret`, to further lower the cost of the instrumentation functions.
This process has been tuned to be safe in multithreaded applications, allowing the process to continue making progress while instrumentation is being added and removed. The atomic updates above make sure that the program counter for all cores always points to a valid instruction.
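To make the two-phase write concrete, here is a minimal, hedged sketch of patching one entry sled on x86-64. The exact patched sequence (a mov of the function ID into r10d followed by a call to the entry trampoline) and the alignment and page-permission handling are assumptions for illustration, not the literal XRay runtime code:

#include <atomic>
#include <cstdint>
#include <cstring>

// Sketch only: assumes the sled's page is already writable and executable,
// and that the sled start is 2-byte aligned so the 16-bit store is atomic.
void PatchEntrySled(uint8_t* sled, uint32_t function_id, void* trampoline) {
  // Assumed target sequence: mov r10d, imm32 (6 bytes) ; call rel32 (5 bytes).
  uint8_t patch[11];
  patch[0] = 0x41;  // REX.B prefix of mov r10d, imm32
  patch[1] = 0xBA;  // opcode of mov r10d, imm32
  std::memcpy(&patch[2], &function_id, 4);
  patch[6] = 0xE8;  // call rel32
  const int32_t rel = static_cast<int32_t>(
      reinterpret_cast<intptr_t>(trampoline) -
      reinterpret_cast<intptr_t>(sled + 11));
  std::memcpy(&patch[7], &rel, 4);

  // Phase 1: write the last 9 bytes. Running threads still see the
  // two-byte `jmp . + 0x09` at the head and skip the partial write.
  std::memcpy(sled + 2, patch + 2, 9);

  // Phase 2: atomically replace the jmp with the first 2 bytes of the
  // mov, making the whole call sequence live in a single step.
  uint16_t head;
  std::memcpy(&head, patch, 2);
  reinterpret_cast<std::atomic<uint16_t>*>(sled)->store(
      head, std::memory_order_release);
}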
The choice in step 3 above depends on whether the user wants to disable the logging functionality only for a period of time. This has been useful for periodically tracing the same application running in production at different times; collecting sample traces this way is one way of gathering enough data for offline analysis.
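As a hedged illustration of how such an enable/disable API might be driven (all function names here are hypothetical placeholders, stubbed out so the sketch is self-contained; they are not the actual XRay runtime API):

#include <chrono>
#include <thread>

void XRayPatchAllSleds() { /* rewrite the sleds (step 1) */ }
void XRayEnableLogging() { /* set the logging flag (step 2) */ }
void XRayDisableLogging() { /* clear the flag, keep patches (step 3b) */ }
void XRayUnpatchAllSleds() { /* restore the original nops (step 3a) */ }

void TraceForAFewSeconds() {
  XRayPatchAllSleds();
  XRayEnableLogging();
  std::this_thread::sleep_for(std::chrono::seconds(3));  // collect under load
  XRayDisableLogging();  // cheap to re-enable later without re-patching
  // Alternatively: XRayUnpatchAllSleds() to remove instrumentation entirely.
}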
When logging is enabled, the installed logging functions do the following:

LogFunctionEntry:
● Get a cycle counter (RDTSC on x86).
● Log an entry to a thread-specific buffer of the following form: Function ID, Function Entry Identifier, Cycle counter delta.
● Return to the calling function.

LogFunctionExit:
● Get a cycle counter (RDTSC on x86).
● Log an entry of the form: Function ID, Function Exit Identifier, Cycle counter delta.
● Return to the calling function.
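A minimal sketch of this entry-logging path (the record and buffer handling here are illustrative; the real implementation packs records more compactly, as described below):

#include <cstdint>
#include <x86intrin.h>  // __rdtsc()

// Illustrative record and per-thread buffer; sizes follow the text below.
struct Record {
  uint32_t function_id;
  uint8_t type;  // 0 = function entry, 1 = function exit
  uint64_t tsc;
};
constexpr int kRecordsPerBlock = 1000;

struct ThreadBuffer {
  Record records[kRecordsPerBlock];
  int used = 0;
};
thread_local ThreadBuffer tls_buffer;

void LogFunctionEntry(uint32_t function_id) {
  if (tls_buffer.used < kRecordsPerBlock) {
    tls_buffer.records[tls_buffer.used++] = {function_id, 0, __rdtsc()};
  }
  // A full block would be exchanged for a fresh one (see the circular
  // buffer of blocks described later).
}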
Each thread is provided an 8 kilobyte (KB) buffer the first time it needs to write a log entry. Each 8KB block is dedicated to a single thread, which eliminates the need to synchronise among multiple threads and implicitly maps blocks to threads without having to record the thread ID for every record. These blocks have a 192-byte header and 1000 8-byte (64-bit) chunks, one per record. Each record packs a function ID, 8 bits of metadata, and a 32-bit timestamp into a single 64-bit chunk.
The 32 bits of timestamp come from the 64-bit TSC, discarding the 10 least significant bits and the highest 22 bits; this gives roughly microsecond accuracy for each log entry in the buffer. The 8 bits of metadata indicate whether an entry is a function entry, a function exit, or a metadata entry; for example, a metadata entry is used to indicate when the TSC wraps around. Further optimisations also allow writing just 4 bytes for a record (6 bits of metadata, 19 bits of function ID, and 8 bits of cycle counter delta).
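A hedged sketch of the 64-bit record layout implied above (the exact bit positions and the 24-bit function ID width are assumptions; only the 32-bit timestamp and 8-bit metadata widths are stated in the text):

#include <cstdint>

enum RecordType : uint8_t { kFunctionEntry = 0, kFunctionExit = 1, kMetadata = 2 };

uint64_t PackRecord(uint32_t function_id, RecordType type, uint64_t tsc) {
  // Keep bits 10..41 of the TSC: dropping the 10 least significant bits of
  // a ~GHz counter yields roughly microsecond resolution, and dropping the
  // top 22 bits leaves a 32-bit value (wraparound is flagged via a
  // metadata record).
  const uint32_t tsc32 = static_cast<uint32_t>(tsc >> 10);
  return (static_cast<uint64_t>(function_id & 0xFFFFFFu) << 40) |
         (static_cast<uint64_t>(type) << 32) | tsc32;
}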
The header bytes include information about the thread (the process ID and thread ID on Linux), a reference 64-bit TSC, and a reference wallclock time at microsecond precision (from gettimeofday). These are used in post-processing to stitch the TSC-derived timestamps together at microsecond granularity, allowing conversion to human-readable timestamps relative to some reference wallclock time.
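For illustration, a sketch of that stitching arithmetic (the function name and the cycles-per-microsecond calibration parameter are assumptions):

#include <cstdint>

// Convert a record's TSC to an absolute microsecond timestamp using the
// block header's reference TSC and reference wallclock (from gettimeofday).
uint64_t ToWallclockMicros(uint64_t header_tsc, uint64_t header_wall_usec,
                           uint64_t record_tsc, double cycles_per_usec) {
  const double delta_usec = (record_tsc - header_tsc) / cycles_per_usec;
  return header_wall_usec + static_cast<uint64_t>(delta_usec);
}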
The logging functions by default prune records whose cycle counter deltas correspond to less than 5 microseconds of walltime. This allows XRay to retain only records that have a measurable impact on walltime. This behaviour is based on internal experimentation and is configurable.
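A sketch of that pruning rule (the threshold default follows the text above; the names and the calibration parameter are illustrative):

#include <cstdint>

// Keep a function call record only if its entry/exit cycle-counter delta
// corresponds to at least the (configurable) walltime threshold.
bool ShouldKeep(uint64_t entry_tsc, uint64_t exit_tsc, double cycles_per_usec,
                double threshold_usec = 5.0) {
  return (exit_tsc - entry_tsc) / cycles_per_usec >= threshold_usec;
}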
The remaining part of the logging implementation is a "dump data" function, which gathers all the thread-specific buffers used by the tracing system and either makes the data available in memory as an iterable set of 8KB chunks or writes it out to different files with a predefined naming convention. The XRay library uses a filename provided via a command-line option; each thread's chunks are written out into different files containing the thread ID. The files contain the raw XRay trace logs, which are later analysed by a separate set of tools.
When a thread runs out of space in a given block, it returns the block to, and gets another block from, a circular buffer of blocks.
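A minimal sketch of that block-recycling scheme (all names are illustrative, and the real runtime's synchronisation is more careful):

#include <cstddef>
#include <mutex>
#include <vector>

constexpr size_t kBlockSize = 8192;  // 8KB blocks, as described above
struct Block { unsigned char data[kBlockSize]; };

class BlockRing {
 public:
  explicit BlockRing(size_t num_blocks) : blocks_(num_blocks) {}

  // A thread hands back its full block (which rejoins the ring and will
  // eventually be overwritten) and receives the next block in the cycle.
  Block* Exchange(Block* /*full_block*/) {
    std::lock_guard<std::mutex> lock(mu_);
    Block* next = &blocks_[next_index_];
    next_index_ = (next_index_ + 1) % blocks_.size();
    return next;
  }

 private:
  std::vector<Block> blocks_;
  size_t next_index_ = 0;
  std::mutex mu_;
};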
These tools turn the raw traces into two main kinds of output:
● HTML trace file. We turn the data into a standalone, interactive, spreadsheet-like HTML file that shows all the function calls that were recorded, which threads they showed up in, which RPCs they were being processed in, and how long each function call took. We've implemented some optimisations in encoding the data so that search and filtering are implemented efficiently in self-hosted JavaScript. This has been used to debug both rare and common performance issues (expensive copies of large objects, memory allocation/deallocation patterns, etc.).
● Domain-specific analysis results. We also have internal tooling that allows writing analysers, using an analysis framework, to perform domain-specific pattern matching. We've used this to find call patterns that usually indicate common problems (slow writes, lock contention, thread residency, etc.).
At Google, XRay is used as one data source within a more comprehensive suite of performance analysis and debugging tools.
Current Work and Future Plans
We are committed to making XRay available as open source software, and as such we are engaging the LLVM [3] community to get all of the pieces of XRay released -- changes to the compilers, the runtime library, and ways to work with the generated traces. We are also committed to engaging open source communities to build the tools that make XRay data more useful. We believe that function call tracing is one tool in a toolbox of debugging and performance analysis tools that every developer should have available.
Once XRay is released and widely available, we are going to work with developers willing to port XRay to other machine architectures and operating systems. We will be actively involved in maintaining XRay and improving it based on feedback from the open source community. Because XRay is used at Google to find hard-to-debug problems, we will make sure that XRay continues to be an actively maintained project for the long term.
We are also going to engage potential users and tool builders who want to make performance debugging a more pleasant experience for the C/C++ developer community, as well as for other languages.
Acknowledgements
XRay was built at Google by Harshit Chopra, Robert Bowdidge, Sanjay Bhansali, and Vlad Losev, with additional contributions by David Goldblatt. Without their efforts and investment in building XRay, the team currently working with it (Alistair Veitch, Dean Berris, Eric Anderson, Nevin Heintze, Ning Wang) and other Google teams would have a very hard time finding performance bugs in production systems. We would also like to thank Eric Christopher, Chandler Carruth, Matt Austern, and Andrew Fikes, who reviewed an early draft of this white paper.
[3] The LLVM Project: http://llvm.org/