5/12/2016 Profiling and optimizing Go programs
http://localhost:3999/gomeetup.slide#1 1/84
Profiling and optimizing Go programs
14 July 2016
Marko Kevac
Software Engineer, Badoo
Introduction
What is profiling and optimization?
Profiling on Linux
Profiling on OSX
OSX profiling was fixed in El Capitan.
Previous versions need a binary patch.
godoc.org/rsc.io/pprof_mac_fix (https://godoc.org/rsc.io/pprof_mac_fix)
CPU
github.com/gperftools/gperftools (https://github.com/gperftools/gperftools)
CPU
pprof is a sampling profiler.
All profilers in Go can be started in different ways, but every one of them works in two phases: collection and visualization.
Example.
Example
package perftest

import (
    "regexp"
    "strings"
    "testing"
)

var haystack = `Lorem ipsum dolor sit amet ... auctor ... elit ...`

func BenchmarkSubstring(b *testing.B) {
    for i := 0; i < b.N; i++ {
        strings.Contains(haystack, "auctor")
    }
}

func BenchmarkRegex(b *testing.B) {
    for i := 0; i < b.N; i++ {
        regexp.MatchString("auctor", haystack)
    }
}
Benchmark
$ go test -bench=.
testing: warning: no tests to run
BenchmarkSubstring-8    10000000     194 ns/op
BenchmarkRegex-8          200000    7516 ns/op
PASS
ok      github.com/mkevac/perftest00    3.789s
Profiling
$ GOGC=off go test -bench=BenchmarkRegex -cpuprofile cpu.out
testing: warning: no tests to run
BenchmarkRegex-8    200000    6773 ns/op
PASS
ok      github.com/mkevac/perftest00    1.491s
GOGC=off turns off the garbage collector.
Turning off the GC can be beneficial for short programs.
When started with -cpuprofile, go test puts the binary in our working directory.
Visualization
Linux
$ go tool pprof perftest00.test cpu.out
(pprof) web
OSX
$ open https://www.xquartz.org
$ ssh -Y server
$ go tool pprof perftest00.test cpu.out
(pprof) web
Other
$ go tool pprof -svg ./perftest00.test ./cpu.out > cpu.svg
$ scp ...
$ open cpu.svg
Fix
package perftest

import (
    "regexp"
    "strings"
    "testing"
)

var haystack = `Lorem ipsum dolor sit amet ... auctor ... elit ...`

// The pattern is compiled once, outside the benchmark loop.
var pattern = regexp.MustCompile("auctor")

func BenchmarkSubstring(b *testing.B) {
    for i := 0; i < b.N; i++ {
        strings.Contains(haystack, "auctor")
    }
}

func BenchmarkRegex(b *testing.B) {
    for i := 0; i < b.N; i++ {
        pattern.MatchString(haystack)
    }
}
Benchmark
$ go test -bench=.
testing: warning: no tests to run
BenchmarkSubstring-8    10000000    170 ns/op
BenchmarkRegex-8         5000000    297 ns/op
PASS
ok      github.com/mkevac/perftest01    3.685s
What about the call graph?
Visualization
We don't see compilation at all.
Ways to start the CPU profiler
1. go test -cpuprofile=cpu.out
2. pprof.StartCPUProfile() and pprof.StopCPUProfile(), or Dave Cheney's great package github.com/pkg/profile (https://github.com/pkg/profile)
3. import _ "net/http/pprof"
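The second option can be sketched as follows. This is a minimal illustration, assuming the profile is collected into memory and then written to a file named cpu.out; busyWork and profileCPU are hypothetical helpers, not part of the talk.

```go
package main

import (
    "bytes"
    "fmt"
    "os"
    "runtime/pprof"
)

// busyWork is a hypothetical workload standing in for real application code.
func busyWork(n int) uint64 {
    var acc uint64
    for i := 0; i < n; i++ {
        acc += uint64(i) * uint64(i)
    }
    return acc
}

// profileCPU runs work with the CPU profiler enabled and returns
// the raw profile bytes, which go tool pprof can read from a file.
func profileCPU(work func()) ([]byte, error) {
    var buf bytes.Buffer
    if err := pprof.StartCPUProfile(&buf); err != nil {
        return nil, err
    }
    work()
    pprof.StopCPUProfile() // flushes the collected samples into buf
    return buf.Bytes(), nil
}

func main() {
    data, err := profileCPU(func() { busyWork(200000000) })
    if err != nil {
        fmt.Fprintln(os.Stderr, "could not start CPU profile:", err)
        os.Exit(1)
    }
    if err := os.WriteFile("cpu.out", data, 0o644); err != nil {
        fmt.Fprintln(os.Stderr, "could not write cpu.out:", err)
        os.Exit(1)
    }
    fmt.Printf("wrote %d bytes to cpu.out\n", len(data))
}
```

The resulting file can then be opened with go tool pprof together with the binary, the same way as the cpu.out produced by go test.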
Example
Example
package main

import (
    "net/http"
    _ "net/http/pprof"
)

func cpuhogger() {
    var acc uint64
    for {
        acc += 1
        if acc&1 == 0 {
            acc <<= 1
        }
    }
}

func main() {
    go http.ListenAndServe("0.0.0.0:8080", nil)
    cpuhogger()
}
Visualization
$ go tool pprof http://localhost:8080/debug/pprof/profile?seconds=5
(pprof) web
(pprof) top
4.99s of 4.99s total (  100%)
      flat  flat%   sum%        cum   cum%
     4.99s   100%   100%      4.99s   100%  main.cpuhogger
         0     0%   100%      4.99s   100%  runtime.goexit
         0     0%   100%      4.99s   100%  runtime.main
(pprof) list cpuhogger
Total: 4.99s
No source information for main.cpuhogger
No disassembly? No source code? We need the binary.
Visualization
$ go tool pprof pproftest http://localhost:8080/debug/pprof/profile?seconds=5
(pprof) list cpuhogger
Total: 4.97s
ROUTINE ======================== main.cpuhogger in /home/marko/goprojects/src/github.com/mkevac/pproft
     4.97s      4.97s (flat, cum)   100% of Total
         .          .      6:)
         .          .      7:
         .          .      8:func cpuhogger() {
         .          .      9:    var acc uint64
         .          .     10:    for {
     2.29s      2.29s     11:        acc += 1
     1.14s      1.14s     12:        if acc&1 == 0 {
     1.54s      1.54s     13:            acc <<= 1
         .          .     14:        }
         .          .     15:    }
         .          .     16:}
         .          .     17:
         .          .     18:func main() {
Visualization
(pprof) disasm cpuhogger
Total: 4.97s
ROUTINE ======================== main.cpuhogger
     4.97s      4.97s (flat, cum)   100% of Total
         .          .     401000: XORL AX, AX
     1.75s      1.75s     401002: INCQ AX
     1.14s      1.14s     401005: TESTQ $0x1, AX
         .          .     40100b: JNE 0x401002
     1.54s      1.54s     40100d: SHLQ $0x1, AX
     540ms      540ms     401010: JMP 0x401002
         .          .     401012: INT $0x3
Why? Let's dig deeper.
Why?
$ curl http://localhost:8080/debug/pprof/profile?seconds=5 -o /tmp/cpu.log
$ strings /tmp/cpu.log | grep cpuhogger
/debug/pprof/symbol is used for acquiring symbols
the binary is needed for disassembly
the binary and the source code are needed for source listings
Currently there is no way to specify the path to the source code (like the "dir" command in gdb) :-(
The binary that you give to pprof and the binary that is running must be the same!
Not deep enough?
How does pprof work?
1. Current desktop and server OSes implement preemptive scheduling (https://en.wikipedia.org/wiki/Preemption_(computing)), or preemptive multitasking (as opposed to cooperative multitasking).
2. Hardware sends a signal to the OS, and the OS runs the scheduler, which can preempt the working process and put another process in its place.
3. pprof works in a similar fashion.
4. man setitimer (http://man7.org/linux/man-pages/man2/setitimer.2.html) and SIGPROF
5. Go sets a handler for SIGPROF which gets and saves stack traces for all goroutines/threads.
6. A separate goroutine gives this data to the user.
A bug in SIGPROF signal delivery (http://research.swtch.com/macpprof) was the reason why profiling on OSX before El Capitan did not work.
How does pprof work?
Cons
1. Signals are not cheap. Do not expect more than 500 signals per second. The default frequency in the Go runtime is 100 Hz.
2. In non-standard builds (-buildmode=c-archive or -buildmode=c-shared) the profiler does not work by default.
3. A user-space process does not have access to kernel stack traces.
Pros
The Go runtime has all the knowledge about its internals.
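Linux system profilers
package main

import (
    "regexp"
    "strings"
)

var haystack = `Lorem ipsum dolor sit amet ... auctor ... elit ...`

func UsingSubstring() bool {
    found := strings.Contains(haystack, "auctor")
    return found
}

func UsingRegex() bool {
    found, _ := regexp.MatchString("auctor", haystack)
    return found
}

func main() {
    // One goroutine searches with strings.Contains, the main goroutine with regexp.
    go func() {
        for {
            UsingSubstring()
        }
    }()
    for {
        UsingRegex()
    }
}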
Systemtap
Systemtap script -> C code -> Kernel module
The stap utility does all of this for you, including loading and unloading the kernel module.
Systemtap
Getting probe list:
$ stap -l 'process("systemtap").function("main.*")'
process("systemtap").function("main.UsingRegex@main.go:16")
process("systemtap").function("main.UsingSubstring@main.go:11")
process("systemtap").function("main.init@main.go:32")
process("systemtap").function("main.main.func1@main.go:22")
process("systemtap").function("main.main@main.go:21")
Getting the probe list with function arguments:
$ stap -L 'process("systemtap").function("runtime.mallocgc")'
process("systemtap").function("runtime.mallocgc@src/runtime/malloc.go:553")
$shouldhelpgc:bool $noscan:bool $scanSize:uintptr $dataSize:uintptr $x:void* $s:struct runtime.mspan*
runtime.g* $size:uintptr $typ:runtime._type* $needzero:bool $~r3:void*
Systemtap does not understand where Go keeps the return value, so we have to get it manually:
printf("%d\n", user_int64(register("rsp") + 8))
Systemtap
global etime
global intervals

probe $1.call {
    etime = gettimeofday_ns()
}

probe $1.return {
    intervals <<< (gettimeofday_ns() - etime) / 1000
}

probe end {
    printf("Duration min:%dus avg:%dus max:%dus count:%d\n",
           @min(intervals), @avg(intervals), @max(intervals),
           @count(intervals))
    printf("Duration (us):\n")
    print(@hist_log(intervals));
    printf("\n")
}
Systemtap
$ sudo stap main.stap 'process("systemtap").function("main.UsingSubstring")'
^CDuration min:0us avg:1us max:586us count:1628362
Duration (us):
value |-------------------------------------------------- count
    0 |  10
    1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  1443040
    2 |@@@@@  173089
    4 |  6982
    8 |  4321
   16 |  631
   32 |  197
   64 |  74
  128 |  13
  256 |  4
  512 |  1
 1024 |  0
 2048 |  0
Systemtap
$ ./systemtap
runtime: unexpected return pc for main.UsingSubstring called from 0x7fffffffe000
fatal error: unknown caller pc
runtime stack:
runtime.throw(0x494e40, 0x11)
        /home/marko/go/src/runtime/panic.go:566 +0x8b
runtime.gentraceback(0xffffffffffffffff, 0xc8200337a8, 0x0, 0xc820001d40, 0x0, 0x0, 0x7fffffff, 0x7fff
        /home/marko/go/src/runtime/traceback.go:311 +0x138c
runtime.scanstack(0xc820001d40)
        /home/marko/go/src/runtime/mgcmark.go:755 +0x249
runtime.scang(0xc820001d40)
        /home/marko/go/src/runtime/proc.go:836 +0x132
runtime.markroot.func1()
        /home/marko/go/src/runtime/mgcmark.go:234 +0x55
runtime.systemstack(0x4e4f00)
        /home/marko/go/src/runtime/asm_amd64.s:298 +0x79
runtime.mstart()
        /home/marko/go/src/runtime/proc.go:1087
Systemtap
The crash happens when Go's garbage collector walks the call stack.
It is probably caused by the trampoline that systemtap puts in our code to handle its probes.
goo.gl/N8XH3p (https://goo.gl/N8XH3p)
No fix yet.
But Go is not alone. There are problems with the uretprobes trampoline in C++ too (https://sourceware.org/bugzilla/show_bug.cgi?id=12275) (2010-)
Systemtap
package main

import (
    "bytes"
    "fmt"
    "math/rand"
    "time"
)

func ToString(number int) string {
    return fmt.Sprintf("%d", number)
}

func main() {
    r := rand.New(rand.NewSource(time.Now().UnixNano()))
    var buf bytes.Buffer
    for i := 0; i < 1000; i++ {
        value := r.Int() % 1000
        value = value - 500
        buf.WriteString(ToString(value))
    }
}
Systemtap
global intervals

probe process("systemtap02").function("main.ToString").call {
    intervals <<< $number
}

probe end {
    printf("Variables min:%dus avg:%dus max:%dus count:%d\n",
           @min(intervals), @avg(intervals), @max(intervals),
           @count(intervals))
    printf("Variables:\n")
    print(@hist_log(intervals));
    printf("\n")
}
Systemtap
Variables min:-499us avg:8us max:497us count:1000
Variables:
value |-------------------------------------------------- count
 -256 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  249
 -128 |@@@@@@@@@@@@@@@@@@@@  121
  -64 |@@@@@@@@@@  60
  -32 |@@@@@@  36
  -16 |@@  12
   -8 |@  8
   -4 |  5
   -2 |  3
   -1 |  2
    0 |  2
    1 |  2
    2 |  3
    4 |@  7
    8 |  4
   16 |@@@  20
   32 |@@@@@  33
   64 |@@@@@@@  44
  128 |@@@@@@@@@@@@@@@@@@  110
  256 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  279
perf and perf_events
$ sudo perf top -p $(pidof systemtap)
Brendan Gregg Flame Graphs
www.brendangregg.com/flamegraphs.html (http://www.brendangregg.com/flamegraphs.html)
Systems Performance: Enterprise and the Cloud
goo.gl/556Hs2 (http://goo.gl/556Hs2)
$ sudo perf record -F 99 -g -p $(pidof systemtap) -- sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.149 MB perf.data (1719 samples) ]
$ sudo perf script | ~/tmp/FlameGraph/stackcollapse-perf.pl > out.perf-folded
$ ~/tmp/FlameGraph/flamegraph.pl out.perf-folded > perf-kernel.svg
Brendan Gregg Flame Graphs
Kernel stack traces!
Memory
What if we were in the C/C++ world? Valgrind! Massif!
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int main() {
    const size_t MB = 1024 * 1024;
    const unsigned count = 20;
    char **buf = calloc(count, sizeof(*buf));

    /* Allocate and touch 1 MB per second. */
    for (unsigned i = 0; i < count; i++) {
        buf[i] = calloc(1, MB);
        memset(buf[i], 0xFF, MB);
        sleep(1);
    }

    /* Free 1 MB per second. */
    for (unsigned i = 0; i < count; i++) {
        free(buf[i]);
        sleep(1);
    }
    free(buf);
}
Valgrind and Massif
[Figure: Massif ASCII graph of heap usage over time; y-axis peak 26.20, x-axis from 0 to 39.13 s]
Valgrind and Massif
Valgrind redefines all memory allocation functions (malloc, calloc, new, free, etc.).
Go does not use them. Go has its own memory allocator, which uses mmap or sbrk.
Memory
Valgrind can catch mmap/sbrk, but there is no point.
All other memory profiling tools work in the same fashion.
We can theoretically use perf/systemtap.
Or we can use the rich internal tools.
Memory
Go can collect information about allocations at some sampling rate (once per 512 KiB allocated by default).
pprof can visualize it.
As with CPU profiling, we have three ways to collect the data. Let's use net/http/pprof this time.
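The sampling rate is exposed as runtime.MemProfileRate, and a heap profile can also be written by hand with runtime/pprof. The following is a minimal sketch under the assumption that recording every allocation is acceptable for a short experiment; keep and heapProfile are hypothetical names used only for illustration.

```go
package main

import (
    "bytes"
    "fmt"
    "os"
    "runtime"
    "runtime/pprof"
)

// keep holds allocations so that they count as live (in-use) memory.
var keep [][]byte

// heapProfile runs a garbage collection so the statistics are up to date
// and returns the heap profile bytes for go tool pprof.
func heapProfile() ([]byte, error) {
    runtime.GC()
    var buf bytes.Buffer
    if err := pprof.WriteHeapProfile(&buf); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}

func main() {
    fmt.Println("sampling rate in bytes before the change:", runtime.MemProfileRate)

    // Record every allocation. This should be set as early as possible
    // and is usually too expensive outside of experiments.
    runtime.MemProfileRate = 1

    for i := 0; i < 1000; i++ {
        keep = append(keep, make([]byte, 1024))
    }

    data, err := heapProfile()
    if err != nil {
        fmt.Fprintln(os.Stderr, "could not write heap profile:", err)
        os.Exit(1)
    }
    fmt.Printf("heap profile: %d bytes\n", len(data))
}
```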
Example
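package main

import (
    "net/http"
    _ "net/http/pprof"
    "time"
)

// allocAndKeep allocates 1 KiB every millisecond and keeps all of it alive.
func allocAndKeep() {
    var b [][]byte
    for {
        b = append(b, make([]byte, 1024))
        time.Sleep(time.Millisecond)
    }
}

// allocAndLeave allocates at the same pace but drops everything after 20 slices.
func allocAndLeave() {
    var b [][]byte
    for {
        b = append(b, make([]byte, 1024))
        if len(b) == 20 {
            b = nil
        }
        time.Sleep(time.Millisecond)
    }
}

func main() {
    go allocAndKeep()
    go allocAndLeave()
    http.ListenAndServe("0.0.0.0:8080", nil)
}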
go tool pprof
alloc_space - allocated bytes
alloc_objects - number of allocated objects
inuse_space - allocated bytes that are in use (live)
inuse_objects - number of allocated objects that are in use (live)
We expect inuse to show only allocAndKeep() and alloc to show both functions.
go tool pprof
$ go tool pprof -inuse_space memtest http://localhost:8080/debug/pprof/heap
Fetching profile from http://localhost:8080/debug/pprof/heap
Saved profile in /home/marko/pprof/pprof.memtest.localhost:8080.inuse_objects.inuse_space.005.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top
15.36MB of 15.36MB total (  100%)
Dropped 2 nodes (cum <= 0.08MB)
      flat  flat%   sum%        cum   cum%
   15.36MB   100%   100%    15.36MB   100%  main.allocAndKeep
         0     0%   100%    15.36MB   100%  runtime.goexit
$ go tool pprof -alloc_space memtest http://localhost:8080/debug/pprof/heap
Fetching profile from http://localhost:8080/debug/pprof/heap
Saved profile in /home/marko/pprof/pprof.memtest.localhost:8080.alloc_objects.alloc_space.008.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top
54.49MB of 54.49MB total (  100%)
Dropped 8 nodes (cum <= 0.27MB)
      flat  flat%   sum%        cum   cum%
   27.97MB 51.33% 51.33%    29.47MB 54.08%  main.allocAndKeep
   23.52MB 43.17% 94.49%    25.02MB 45.92%  main.allocAndLeave
       3MB  5.51%   100%        3MB  5.51%  time.Sleep
         0     0%   100%    54.49MB   100%  runtime.goexit
Sleep?
Looks as predicted. But what about Sleep?
(pprof) list time.Sleep
Total: 54.49MB
ROUTINE ======================== time.Sleep in /home/marko/go/src/runtime/time.go
       3MB        3MB (flat, cum)  5.51% of Total
         .          .     48:func timeSleep(ns int64) {
         .          .     49:    if ns <= 0 {
         .          .     50:        return
         .          .     51:    }
         .          .     52:
       3MB        3MB     53:    t := new(timer)
         .          .     54:    t.when = nanotime() + ns
         .          .     55:    t.f = goroutineReady
         .          .     56:    t.arg = getg()
         .          .     57:    lock(&timers.lock)
         .          .     58:    addtimerLocked(t)
Implicit allocations
package printtest

import (
    "bytes"
    "fmt"
    "testing"
)

func BenchmarkPrint(b *testing.B) {
    var buf bytes.Buffer
    var s string = "test string"
    for i := 0; i < b.N; i++ {
        buf.Reset()
        fmt.Fprintf(&buf, "string is: %s", s)
    }
}
Benchmark?
Benchmark
$ go test -bench=. -benchmem
testing: warning: no tests to run
BenchmarkPrint-8    10000000    128 ns/op    16 B/op    1 allocs/op
PASS
ok      github.com/mkevac/converttest    1.420s
Profiling
$ go test -bench=. -memprofile=mem.out -memprofilerate=1
-memprofilerate sets the profiling rate; 1 means every allocation is recorded.
$ go tool pprof -alloc_space converttest.test mem.out
(pprof) top
15.41MB of 15.48MB total (99.59%)
Dropped 73 nodes (cum <= 0.08MB)
      flat  flat%   sum%        cum   cum%
   15.41MB 99.59% 99.59%    15.43MB 99.67%  github.com/mkevac/converttest.BenchmarkPrint
         0     0% 99.59%    15.47MB 99.93%  runtime.goexit
         0     0% 99.59%    15.42MB 99.66%  testing.(*B).launch
         0     0% 99.59%    15.43MB 99.67%  testing.(*B).runN
Profiling
(pprof) list BenchmarkPrint
Total: 15.48MB
ROUTINE ======================== github.com/mkevac/converttest.BenchmarkPrint in /home/marko/goproject
   15.41MB    15.43MB (flat, cum) 99.67% of Total
         .          .      9:func BenchmarkPrint(b *testing.B) {
         .          .     10:    var buf bytes.Buffer
         .          .     11:    var s string = "test string"
         .          .     12:    for i := 0; i < b.N; i++ {
         .          .     13:        buf.Reset()
   15.41MB    15.43MB     14:        fmt.Fprintf(&buf, "string is: %s", s)
         .          .     15:    }
         .          .     16:}
Profiling
(pprof) list fmt.Fprintf
Total: 15.48MB
ROUTINE ======================== fmt.Fprintf in /home/marko/go/src/fmt/print.go
         0    12.02kB (flat, cum) 0.076% of Total
         .          .    175:// These routines end in 'f' and take a format string.
         .          .    176:
         .          .    177:// Fprintf formats according to a format specifier and writes to w.
         .          .    178:// It returns the number of bytes written and any write error encountered.
         .          .    179:func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error)
         .    11.55kB    180:    p := newPrinter()
         .       480B    181:    p.doPrintf(format, a)
         .          .    182:    n, err = w.Write(p.buf)
         .          .    183:    p.free()
         .          .    184:    return
         .          .    185:}
         .          .    186:
Disassembly
         .          .     466edb: CALL bytes.(*Buffer).Reset(SB)
         .          .     466ee0: LEAQ 0x98b6b(IP), AX
         .          .     466ee7: MOVQ AX, 0x70(SP)
         .          .     466eec: MOVQ $0xb, 0x78(SP)
         .          .     466ef5: MOVQ $0x0, 0x60(SP)
         .          .     466efe: MOVQ $0x0, 0x68(SP)
         .          .     466f07: LEAQ 0x70d92(IP), AX
         .          .     466f0e: MOVQ AX, 0(SP)
         .          .     466f12: LEAQ 0x70(SP), AX
         .          .     466f17: MOVQ AX, 0x8(SP)
         .          .     466f1c: MOVQ $0x0, 0x10(SP)
   15.41MB    15.41MB     466f25: CALL runtime.convT2E(SB)
         .          .     466f2a: MOVQ 0x18(SP), AX
         .          .     466f2f: MOVQ 0x20(SP), CX
         .          .     466f34: MOVQ AX, 0x60(SP)
         .          .     466f39: MOVQ CX, 0x68(SP)
         .          .     466f3e: LEAQ 0x10b35b(IP), AX
         .          .     466f45: MOVQ AX, 0(SP)
         .          .     466f49: MOVQ 0x58(SP), AX
         .          .     466f4e: MOVQ AX, 0x8(SP)
         .          .     466f53: LEAQ 0x99046(IP), CX
         .          .     466f5a: MOVQ CX, 0x10(SP)
         .          .     466f5f: MOVQ $0xd, 0x18(SP)
         .          .     466f68: LEAQ 0x60(SP), CX
         .          .     466f6d: MOVQ CX, 0x20(SP)
         .          .     466f72: MOVQ $0x1, 0x28(SP)
         .          .     466f7b: MOVQ $0x1, 0x30(SP)
         .    12.02kB     466f84: CALL fmt.Fprintf(SB)
fprintf
func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error)
interface{} looks the same as void*... but it's not.
Go internal types
string, chan, func, slice, interface, etc.
Empty interface
var s string = "marko"
var a interface{} = &s
no allocation
var s string = "marko"
var a interface{} = s
16-byte allocation
Fix
package main

import (
    "bytes"
    "testing"
)

func BenchmarkPrint(b *testing.B) {
    var buf bytes.Buffer
    var s string = "test string"
    for i := 0; i < b.N; i++ {
        buf.Reset()
        // Writing the pieces directly avoids the interface{} conversion.
        buf.WriteString("string is: ")
        buf.WriteString(s)
    }
}
Benchmark?
Benchmark
$ go test -bench=BenchmarkPrint -benchmem
testing: warning: no tests to run
BenchmarkPrint-8    50000000    27.5 ns/op    0 B/op    0 allocs/op
PASS
ok      github.com/mkevac/converttest01    1.413s
0 allocations and a 4x speedup
Implicit allocation
A string and a char * are pretty much the same thing in C. But not in Go.
package main

import (
    "fmt"
)

func main() {
    var array = []byte{'m', 'a', 'r', 'k', 'o'}
    if string(array) == "marko" {
        fmt.Println("equal")
    }
}
Implicit allocation
Always check your assumptions.
The Go runtime, the Go compiler and the Go tools get better every day.
An optimization you read about in 2010 may no longer be needed, or may even be harmful.
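Checking such an assumption can be as cheap as a call to testing.AllocsPerRun. The sketch below measures whether converting a byte slice to a string purely for a comparison allocates; compareAllocs and matches are hypothetical names, and the result is worth re-measuring on the Go version actually in use.

```go
package main

import (
    "fmt"
    "testing"
)

// matches keeps the comparison result observable so it is not optimized away.
var matches int

// compareAllocs reports the average number of allocations made by
// converting b to a string only in order to compare it.
func compareAllocs(b []byte) float64 {
    return testing.AllocsPerRun(1000, func() {
        if string(b) == "marko" {
            matches++
        }
    })
}

func main() {
    array := []byte{'m', 'a', 'r', 'k', 'o'}
    fmt.Println("string(array) == \"marko\":", compareAllocs(array), "allocs/op")
}
```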
Example (again)
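package main

import (
    "bytes"
    "reflect" // reflect and unsafe are needed by BytesToString in the fix
    "testing"
    "unsafe"
)

var s string

func BenchmarkConvert(b *testing.B) {
    var buf bytes.Buffer
    var array = []byte{'m', 'a', 'r', 'k', 'o', 0}
    for i := 0; i < b.N; i++ {
        buf.Reset()
        s = string(array) // copies the bytes into a new string
        buf.WriteString(s)
    }
}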
Benchmark
$ go test -bench=. -benchmem
testing: warning: no tests to run
BenchmarkConvert-8    30000000    42.1 ns/op    8 B/op    1 allocs/op
Fix
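func BytesToString(b []byte) string {
    // Reinterpret the slice header as a string header: no copy, no allocation.
    // The caller must not modify b while the returned string is in use.
    bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
    sh := reflect.StringHeader{Data: bh.Data, Len: bh.Len}
    return *(*string)(unsafe.Pointer(&sh))
}

func BenchmarkNoConvert(b *testing.B) {
    var buf bytes.Buffer
    var array = []byte{'m', 'a', 'r', 'k', 'o', 0}
    for i := 0; i < b.N; i++ {
        buf.Reset()
        s = BytesToString(array)
        buf.WriteString(s)
    }
}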
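On Go 1.20 and later the same zero-copy conversion can be written without reflect headers, using unsafe.String and unsafe.SliceData. This is an alternative sketch, not the code from the talk; bytesToString is a hypothetical name, and the same caveat applies: the byte slice must not be modified while the string is in use.

```go
package main

import (
    "fmt"
    "unsafe"
)

// bytesToString returns a string that shares memory with b (requires Go 1.20+).
// The caller must not modify b for as long as the string is in use.
func bytesToString(b []byte) string {
    if len(b) == 0 {
        return ""
    }
    return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
    array := []byte{'m', 'a', 'r', 'k', 'o'}
    fmt.Println(bytesToString(array))
}
```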
Benchmark
$ go test -bench=. -benchmem
testing: warning: no tests to run
BenchmarkConvert-8       30000000    44.5 ns/op    8 B/op    1 allocs/op
BenchmarkNoConvert-8    100000000    19.2 ns/op    0 B/op    0 allocs/op
PASS
ok      github.com/mkevac/bytetostring    3.332s
Tracing
The Go runtime can record almost everything it does:
scheduling, channel operations, locks, thread creation, ...
The full list is in runtime/trace.go.
For visualization, go tool trace uses the same JS package that Chrome uses for page-load visualization.
Example.
debugcharts
github.com/mkevac/debugcharts (http://github.com/mkevac/debugcharts)
It calls runtime.ReadMemStats() once a second.
Example
package main

import (
    "net/http"
    _ "net/http/pprof"
    "time"

    _ "github.com/mkevac/debugcharts"
)

// CPUHogger spins on a counter and sleeps for 50 ms every two seconds.
func CPUHogger() {
    var acc uint64
    t := time.Tick(2 * time.Second)
    for {
        select {
        case <-t:
            time.Sleep(50 * time.Millisecond)
        default:
            acc++
        }
    }
}

func main() {
    go CPUHogger()
    go CPUHogger()
    http.ListenAndServe("0.0.0.0:8181", nil)
}
Tracing
$ curl http://localhost:8181/debug/pprof/trace?seconds=10 -o trace.out
Sometimes all you can visualize is 1-3 seconds.
$ go tool trace -http "0.0.0.0:8080" ./tracetest trace.out
proc stop and proc start
runtime.ReadMemStats()
// ReadMemStats populates m with memory allocator statistics.
func ReadMemStats(m *MemStats) {
    stopTheWorld("read mem stats")

    systemstack(func() {
        readmemstats_m(m)
    })

    startTheWorld()
}
Production? No!
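To get a feeling for the cost on a given machine and heap, a single call can be timed. This is only a sketch: timedMemStats is a hypothetical helper, and the wall-clock time of one call does not show the effect that stopping the world has on the other goroutines.

```go
package main

import (
    "fmt"
    "runtime"
    "time"
)

// timedMemStats reads the memory statistics once and reports
// how long the call to runtime.ReadMemStats took.
func timedMemStats() (runtime.MemStats, time.Duration) {
    var m runtime.MemStats
    start := time.Now()
    runtime.ReadMemStats(&m)
    return m, time.Since(start)
}

func main() {
    m, d := timedMemStats()
    fmt.Printf("HeapAlloc=%d bytes, Sys=%d bytes, call took %v\n", m.HeapAlloc, m.Sys, d)
}
```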
Conclusion
There is so much more.
Conclusion
CPU profiler
Memory profiler
All allocations tracing
Escape analysis
Lock/Contention profiler
Scheduler tracing
Tracing
GC tracing
Real time memory statistics
System profilers like perf and systemtap.
But no tool will replace a deep understanding of how your program works from start to finish.
I hope that today's crash course was helpful.
Stay curious
Thank you
Marko Kevac
Software Engineer, Badoo
marko@kevac.org (mailto:marko@kevac.org)
@mkevac (http://twitter.com/mkevac)
TomcatCon: from a cluster to the cloud
Jean-Frederic Clere
 
How to inspect a RUNNING perl process
How to inspect a RUNNING perl processHow to inspect a RUNNING perl process
How to inspect a RUNNING perl process
Masaaki HIROSE
 
Odoo Online platform: architecture and challenges
Odoo Online platform: architecture and challengesOdoo Online platform: architecture and challenges
Odoo Online platform: architecture and challenges
Odoo
 
The origin: Init (compact version)
The origin: Init (compact version)The origin: Init (compact version)
The origin: Init (compact version)
Tzung-Bi Shih
 
Make Your Own Developement Board @ 2014.4.21 JuluOSDev
Make Your Own Developement Board @ 2014.4.21 JuluOSDevMake Your Own Developement Board @ 2014.4.21 JuluOSDev
Make Your Own Developement Board @ 2014.4.21 JuluOSDev
Jian-Hong Pan
 
Node.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterNode.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitter
Simen Li
 
What is new in Go 1.8
What is new in Go 1.8What is new in Go 1.8
What is new in Go 1.8
John Hua
 
Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!
Michael Barker
 
A little systemtap
A little systemtapA little systemtap
A little systemtap
yang bingwu
 
Tomcat from a cluster to the cloud on RP3
Tomcat from a cluster to the cloud on RP3Tomcat from a cluster to the cloud on RP3
Tomcat from a cluster to the cloud on RP3
Jean-Frederic Clere
 
Rapid Application Design in Financial Services
Rapid Application Design in Financial ServicesRapid Application Design in Financial Services
Rapid Application Design in Financial Services
Aerospike
 
Object Storage with Gluster
Object Storage with GlusterObject Storage with Gluster
Object Storage with Gluster
Gluster.org
 
Refactoring for testability c++
Refactoring for testability c++Refactoring for testability c++
Refactoring for testability c++
Dimitrios Platis
 

Viewers also liked (20)

Reform: путь к лучшему ORM
Reform: путь к лучшему ORMReform: путь к лучшему ORM
Reform: путь к лучшему ORM
Badoo Development
 
Семь тысяч Rps, один go
Семь тысяч Rps, один goСемь тысяч Rps, один go
Семь тысяч Rps, один go
Badoo Development
 
«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)
«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)
«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)
AvitoTech
 
Golang в avito
Golang в avitoGolang в avito
Golang в avito
AvitoTech
 
«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)
«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)
«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)
AvitoTech
 
TechLeads meetup: Андрей Шелёхин, Tinkoff.ru
TechLeads meetup: Андрей Шелёхин, Tinkoff.ruTechLeads meetup: Андрей Шелёхин, Tinkoff.ru
TechLeads meetup: Андрей Шелёхин, Tinkoff.ru
Badoo Development
 
TechLeads meetup: Евгений Потапов, ITSumma
TechLeads meetup: Евгений Потапов, ITSumma TechLeads meetup: Евгений Потапов, ITSumma
TechLeads meetup: Евгений Потапов, ITSumma
Badoo Development
 
TechLeads meetup: Макс Лапшин, Erlyvideo
TechLeads meetup: Макс Лапшин, ErlyvideoTechLeads meetup: Макс Лапшин, Erlyvideo
TechLeads meetup: Макс Лапшин, Erlyvideo
Badoo Development
 
TechLeads meetup: Алексей Рыбак, Badoo
TechLeads meetup: Алексей Рыбак, BadooTechLeads meetup: Алексей Рыбак, Badoo
TechLeads meetup: Алексей Рыбак, Badoo
Badoo Development
 
Парсим CSS
Парсим CSSПарсим CSS
Парсим CSS
Badoo Development
 
Что надо знать о HTTP/2
Что надо знать о HTTP/2Что надо знать о HTTP/2
Что надо знать о HTTP/2
Badoo Development
 
Классическое программирование для фронтендеров
Классическое программирование для фронтендеровКлассическое программирование для фронтендеров
Классическое программирование для фронтендеров
Badoo Development
 
S.O.L.I.D-ый JavaScript
S.O.L.I.D-ый JavaScriptS.O.L.I.D-ый JavaScript
S.O.L.I.D-ый JavaScript
Badoo Development
 
Как мы общаемся с пользователями на 46 языках и понимаем друг друга
Как мы общаемся с пользователями на 46 языках и понимаем друг другаКак мы общаемся с пользователями на 46 языках и понимаем друг друга
Как мы общаемся с пользователями на 46 языках и понимаем друг друга
Badoo Development
 
"Геолокация в Badoo", Андрей Воликов (Badoo)
"Геолокация в Badoo", Андрей Воликов (Badoo)"Геолокация в Badoo", Андрей Воликов (Badoo)
"Геолокация в Badoo", Андрей Воликов (Badoo)
Badoo Development
 
"Новые возможности MySQL 5.7"
"Новые возможности MySQL 5.7""Новые возможности MySQL 5.7"
"Новые возможности MySQL 5.7"
Badoo Development
 
Docker networking
Docker networkingDocker networking
Docker networking
Badoo Development
 
"Обзор Tarantool DB"
"Обзор Tarantool DB""Обзор Tarantool DB"
"Обзор Tarantool DB"
Badoo Development
 
"PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио...
"PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио..."PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио...
"PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио...
Badoo Development
 
"Производительность MySQL: что нового?"
"Производительность MySQL: что нового?""Производительность MySQL: что нового?"
"Производительность MySQL: что нового?"
Badoo Development
 
Reform: путь к лучшему ORM
Reform: путь к лучшему ORMReform: путь к лучшему ORM
Reform: путь к лучшему ORM
Badoo Development
 
Семь тысяч Rps, один go
Семь тысяч Rps, один goСемь тысяч Rps, один go
Семь тысяч Rps, один go
Badoo Development
 
«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)
«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)
«Миллион открытых каналов с данными по сети» – Илья Биин (Zenhotels)
AvitoTech
 
Golang в avito
Golang в avitoGolang в avito
Golang в avito
AvitoTech
 
«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)
«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)
«Как 200 строк на Go помогли нам освободить 15 серверов» – Паша Мурзаков (Badoo)
AvitoTech
 
TechLeads meetup: Андрей Шелёхин, Tinkoff.ru
TechLeads meetup: Андрей Шелёхин, Tinkoff.ruTechLeads meetup: Андрей Шелёхин, Tinkoff.ru
TechLeads meetup: Андрей Шелёхин, Tinkoff.ru
Badoo Development
 
TechLeads meetup: Евгений Потапов, ITSumma
TechLeads meetup: Евгений Потапов, ITSumma TechLeads meetup: Евгений Потапов, ITSumma
TechLeads meetup: Евгений Потапов, ITSumma
Badoo Development
 
TechLeads meetup: Макс Лапшин, Erlyvideo
TechLeads meetup: Макс Лапшин, ErlyvideoTechLeads meetup: Макс Лапшин, Erlyvideo
TechLeads meetup: Макс Лапшин, Erlyvideo
Badoo Development
 
TechLeads meetup: Алексей Рыбак, Badoo
TechLeads meetup: Алексей Рыбак, BadooTechLeads meetup: Алексей Рыбак, Badoo
TechLeads meetup: Алексей Рыбак, Badoo
Badoo Development
 
Что надо знать о HTTP/2
Что надо знать о HTTP/2Что надо знать о HTTP/2
Что надо знать о HTTP/2
Badoo Development
 
Классическое программирование для фронтендеров
Классическое программирование для фронтендеровКлассическое программирование для фронтендеров
Классическое программирование для фронтендеров
Badoo Development
 
Как мы общаемся с пользователями на 46 языках и понимаем друг друга
Как мы общаемся с пользователями на 46 языках и понимаем друг другаКак мы общаемся с пользователями на 46 языках и понимаем друг друга
Как мы общаемся с пользователями на 46 языках и понимаем друг друга
Badoo Development
 
"Геолокация в Badoo", Андрей Воликов (Badoo)
"Геолокация в Badoo", Андрей Воликов (Badoo)"Геолокация в Badoo", Андрей Воликов (Badoo)
"Геолокация в Badoo", Андрей Воликов (Badoo)
Badoo Development
 
"Новые возможности MySQL 5.7"
"Новые возможности MySQL 5.7""Новые возможности MySQL 5.7"
"Новые возможности MySQL 5.7"
Badoo Development
 
"PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио...
"PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио..."PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио...
"PostgreSQL для разработчиков приложений", Павел Лузанов, (Постгрес Профессио...
Badoo Development
 
"Производительность MySQL: что нового?"
"Производительность MySQL: что нового?""Производительность MySQL: что нового?"
"Производительность MySQL: что нового?"
Badoo Development
 
Ad

Similar to Profiling and optimizing go programs (20)

carrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-APIcarrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-API
Yoni Davidson
 
Optimizing and Profiling Golang Rest Api
Optimizing and Profiling Golang Rest ApiOptimizing and Profiling Golang Rest Api
Optimizing and Profiling Golang Rest Api
Iman Syahputra Situmorang
 
Debugging Hung Python Processes With GDB
Debugging Hung Python Processes With GDBDebugging Hung Python Processes With GDB
Debugging Hung Python Processes With GDB
bmbouter
 
Go profiling introduction
Go profiling introductionGo profiling introduction
Go profiling introduction
William Lin
 
Why you should care about Go (Golang)
Why you should care about Go (Golang)Why you should care about Go (Golang)
Why you should care about Go (Golang)
Aaron Schlesinger
 
GroongaアプリケーションをDockerコンテナ化して配布する
GroongaアプリケーションをDockerコンテナ化して配布するGroongaアプリケーションをDockerコンテナ化して配布する
GroongaアプリケーションをDockerコンテナ化して配布する
ongaeshi
 
Continuous Go Profiling & Observability
Continuous Go Profiling & ObservabilityContinuous Go Profiling & Observability
Continuous Go Profiling & Observability
ScyllaDB
 
Write microservice in golang
Write microservice in golangWrite microservice in golang
Write microservice in golang
Bo-Yi Wu
 
GopherCon IL 2020 - Web Application Profiling 101
GopherCon IL 2020 - Web Application Profiling 101GopherCon IL 2020 - Web Application Profiling 101
GopherCon IL 2020 - Web Application Profiling 101
yinonavraham
 
Where's the source, Luke? : How to find and debug the code behind Plone
Where's the source, Luke? : How to find and debug the code behind PloneWhere's the source, Luke? : How to find and debug the code behind Plone
Where's the source, Luke? : How to find and debug the code behind Plone
Vincenzo Barone
 
Linux Security and How Web Browser Sandboxes Really Work (NDC Oslo 2017)
Linux Security  and How Web Browser Sandboxes Really Work (NDC Oslo 2017)Linux Security  and How Web Browser Sandboxes Really Work (NDC Oslo 2017)
Linux Security and How Web Browser Sandboxes Really Work (NDC Oslo 2017)
Patricia Aas
 
Ratpack - Classy and Compact Groovy Web Apps
Ratpack - Classy and Compact Groovy Web AppsRatpack - Classy and Compact Groovy Web Apps
Ratpack - Classy and Compact Groovy Web Apps
James Williams
 
Go - techniques for writing high performance Go applications
Go - techniques for writing high performance Go applicationsGo - techniques for writing high performance Go applications
Go - techniques for writing high performance Go applications
ss63261
 
Introduction to Google Colaboratory.pdf
Introduction to Google Colaboratory.pdfIntroduction to Google Colaboratory.pdf
Introduction to Google Colaboratory.pdf
Yomna Mahmoud Ibrahim Hassan
 
Learn flask in 90mins
Learn flask in 90minsLearn flask in 90mins
Learn flask in 90mins
Larry Cai
 
Happy porting x86 application to android
Happy porting x86 application to androidHappy porting x86 application to android
Happy porting x86 application to android
Owen Hsu
 
2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku
ronnywang_tw
 
Hand-on Resources II: Extending SCMSWeb
Hand-on Resources II: Extending SCMSWebHand-on Resources II: Extending SCMSWeb
Hand-on Resources II: Extending SCMSWeb
Sugree Phatanapherom
 
Hands-on go profiling
Hands-on go profilingHands-on go profiling
Hands-on go profiling
Daniel Ammar
 
PHP Development Tools
PHP  Development ToolsPHP  Development Tools
PHP Development Tools
Antony Abramchenko
 
carrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-APIcarrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-API
Yoni Davidson
 
Debugging Hung Python Processes With GDB
Debugging Hung Python Processes With GDBDebugging Hung Python Processes With GDB
Debugging Hung Python Processes With GDB
bmbouter
 
Go profiling introduction
Go profiling introductionGo profiling introduction
Go profiling introduction
William Lin
 
Why you should care about Go (Golang)
Why you should care about Go (Golang)Why you should care about Go (Golang)
Why you should care about Go (Golang)
Aaron Schlesinger
 
GroongaアプリケーションをDockerコンテナ化して配布する
GroongaアプリケーションをDockerコンテナ化して配布するGroongaアプリケーションをDockerコンテナ化して配布する
GroongaアプリケーションをDockerコンテナ化して配布する
ongaeshi
 
Continuous Go Profiling & Observability
Continuous Go Profiling & ObservabilityContinuous Go Profiling & Observability
Continuous Go Profiling & Observability
ScyllaDB
 
Write microservice in golang
Write microservice in golangWrite microservice in golang
Write microservice in golang
Bo-Yi Wu
 
GopherCon IL 2020 - Web Application Profiling 101
GopherCon IL 2020 - Web Application Profiling 101GopherCon IL 2020 - Web Application Profiling 101
GopherCon IL 2020 - Web Application Profiling 101
yinonavraham
 
Where's the source, Luke? : How to find and debug the code behind Plone
Where's the source, Luke? : How to find and debug the code behind PloneWhere's the source, Luke? : How to find and debug the code behind Plone
Where's the source, Luke? : How to find and debug the code behind Plone
Vincenzo Barone
 
Linux Security and How Web Browser Sandboxes Really Work (NDC Oslo 2017)
Linux Security  and How Web Browser Sandboxes Really Work (NDC Oslo 2017)Linux Security  and How Web Browser Sandboxes Really Work (NDC Oslo 2017)
Linux Security and How Web Browser Sandboxes Really Work (NDC Oslo 2017)
Patricia Aas
 
Ratpack - Classy and Compact Groovy Web Apps
Ratpack - Classy and Compact Groovy Web AppsRatpack - Classy and Compact Groovy Web Apps
Ratpack - Classy and Compact Groovy Web Apps
James Williams
 
Go - techniques for writing high performance Go applications
Go - techniques for writing high performance Go applicationsGo - techniques for writing high performance Go applications
Go - techniques for writing high performance Go applications
ss63261
 
Learn flask in 90mins
Learn flask in 90minsLearn flask in 90mins
Learn flask in 90mins
Larry Cai
 
Happy porting x86 application to android
Happy porting x86 application to androidHappy porting x86 application to android
Happy porting x86 application to android
Owen Hsu
 
2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku2012 coscup - Build your PHP application on Heroku
2012 coscup - Build your PHP application on Heroku
ronnywang_tw
 
Hand-on Resources II: Extending SCMSWeb
Hand-on Resources II: Extending SCMSWebHand-on Resources II: Extending SCMSWeb
Hand-on Resources II: Extending SCMSWeb
Sugree Phatanapherom
 
Hands-on go profiling
Hands-on go profilingHands-on go profiling
Hands-on go profiling
Daniel Ammar
 
Ad

More from Badoo Development (20)

Viktar Karanevich – iOS Parallel Automation
Viktar Karanevich – iOS Parallel AutomationViktar Karanevich – iOS Parallel Automation
Viktar Karanevich – iOS Parallel Automation
Badoo Development
 
Как мы делаем модули PHP в Badoo – Антон Довгаль
Как мы делаем модули PHP в Badoo – Антон ДовгальКак мы делаем модули PHP в Badoo – Антон Довгаль
Как мы делаем модули PHP в Badoo – Антон Довгаль
Badoo Development
 
Григорий Джанелидзе, OK.RU
Григорий Джанелидзе, OK.RUГригорий Джанелидзе, OK.RU
Григорий Джанелидзе, OK.RU
Badoo Development
 
Андрей Сидоров, Яндекс.Браузер
Андрей Сидоров, Яндекс.БраузерАндрей Сидоров, Яндекс.Браузер
Андрей Сидоров, Яндекс.Браузер
Badoo Development
 
Филипп Уваров, Avito
Филипп Уваров, AvitoФилипп Уваров, Avito
Филипп Уваров, Avito
Badoo Development
 
Cocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magicCocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magic
Badoo Development
 
Cocoaheads Meetup / Kateryna Trofimenko / Feature development
Cocoaheads Meetup / Kateryna Trofimenko / Feature developmentCocoaheads Meetup / Kateryna Trofimenko / Feature development
Cocoaheads Meetup / Kateryna Trofimenko / Feature development
Badoo Development
 
Alex Krasheninnikov – Hadoop High Availability
Alex Krasheninnikov – Hadoop High AvailabilityAlex Krasheninnikov – Hadoop High Availability
Alex Krasheninnikov – Hadoop High Availability
Badoo Development
 
Андрей Денисов – В ожидании мониторинга баз данных
Андрей Денисов – В ожидании мониторинга баз данныхАндрей Денисов – В ожидании мониторинга баз данных
Андрей Денисов – В ожидании мониторинга баз данных
Badoo Development
 
Александр Зобнин, Grafana Labs
Александр Зобнин, Grafana LabsАлександр Зобнин, Grafana Labs
Александр Зобнин, Grafana Labs
Badoo Development
 
Илья Аблеев – Zabbix в Badoo: реагируем быстро и качественно
Илья Аблеев – Zabbix в Badoo: реагируем быстро и качественноИлья Аблеев – Zabbix в Badoo: реагируем быстро и качественно
Илья Аблеев – Zabbix в Badoo: реагируем быстро и качественно
Badoo Development
 
Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»
Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»  Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»
Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»
Badoo Development
 
Как мы готовим MySQL
 Как мы готовим MySQL  Как мы готовим MySQL
Как мы готовим MySQL
Badoo Development
 
Архитектура хранения и отдачи фотографий в Badoo
Архитектура хранения и отдачи фотографий в Badoo Архитектура хранения и отдачи фотографий в Badoo
Архитектура хранения и отдачи фотографий в Badoo
Badoo Development
 
5 способов деплоя PHP-кода в условиях хайлоада
5 способов деплоя PHP-кода в условиях хайлоада5 способов деплоя PHP-кода в условиях хайлоада
5 способов деплоя PHP-кода в условиях хайлоада
Badoo Development
 
ChromeDriver Jailbreak
ChromeDriver JailbreakChromeDriver Jailbreak
ChromeDriver Jailbreak
Badoo Development
 
Git хуки на страже качества кода
Git хуки на страже качества кодаGit хуки на страже качества кода
Git хуки на страже качества кода
Badoo Development
 
Versioning strategy for a complex internal API
Versioning strategy for a complex internal APIVersioning strategy for a complex internal API
Versioning strategy for a complex internal API
Badoo Development
 
Как мы готовим MySQL
Как мы готовим MySQLКак мы готовим MySQL
Как мы готовим MySQL
Badoo Development
 
Методология: БЭМ, Модули, Отношения
Методология: БЭМ, Модули, ОтношенияМетодология: БЭМ, Модули, Отношения
Методология: БЭМ, Модули, Отношения
Badoo Development
 
Viktar Karanevich – iOS Parallel Automation
Viktar Karanevich – iOS Parallel AutomationViktar Karanevich – iOS Parallel Automation
Viktar Karanevich – iOS Parallel Automation
Badoo Development
 
Как мы делаем модули PHP в Badoo – Антон Довгаль
Как мы делаем модули PHP в Badoo – Антон ДовгальКак мы делаем модули PHP в Badoo – Антон Довгаль
Как мы делаем модули PHP в Badoo – Антон Довгаль
Badoo Development
 
Григорий Джанелидзе, OK.RU
Григорий Джанелидзе, OK.RUГригорий Джанелидзе, OK.RU
Григорий Джанелидзе, OK.RU
Badoo Development
 
Андрей Сидоров, Яндекс.Браузер
Андрей Сидоров, Яндекс.БраузерАндрей Сидоров, Яндекс.Браузер
Андрей Сидоров, Яндекс.Браузер
Badoo Development
 
Филипп Уваров, Avito
Филипп Уваров, AvitoФилипп Уваров, Avito
Филипп Уваров, Avito
Badoo Development
 
Cocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magicCocoaheads Meetup / Alex Zimin / Swift magic
Cocoaheads Meetup / Alex Zimin / Swift magic
Badoo Development
 
Cocoaheads Meetup / Kateryna Trofimenko / Feature development
Cocoaheads Meetup / Kateryna Trofimenko / Feature developmentCocoaheads Meetup / Kateryna Trofimenko / Feature development
Cocoaheads Meetup / Kateryna Trofimenko / Feature development
Badoo Development
 
Alex Krasheninnikov – Hadoop High Availability
Alex Krasheninnikov – Hadoop High AvailabilityAlex Krasheninnikov – Hadoop High Availability
Alex Krasheninnikov – Hadoop High Availability
Badoo Development
 
Андрей Денисов – В ожидании мониторинга баз данных
Андрей Денисов – В ожидании мониторинга баз данныхАндрей Денисов – В ожидании мониторинга баз данных
Андрей Денисов – В ожидании мониторинга баз данных
Badoo Development
 
Александр Зобнин, Grafana Labs
Александр Зобнин, Grafana LabsАлександр Зобнин, Grafana Labs
Александр Зобнин, Grafana Labs
Badoo Development
 
Илья Аблеев – Zabbix в Badoo: реагируем быстро и качественно
Илья Аблеев – Zabbix в Badoo: реагируем быстро и качественноИлья Аблеев – Zabbix в Badoo: реагируем быстро и качественно
Илья Аблеев – Zabbix в Badoo: реагируем быстро и качественно
Badoo Development
 
Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»
Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»  Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»
Паша Мурзаков: Как 200 строк на Go помогли нам освободить 15 серверов»
Badoo Development
 
Как мы готовим MySQL
 Как мы готовим MySQL  Как мы готовим MySQL
Как мы готовим MySQL
Badoo Development
 
Архитектура хранения и отдачи фотографий в Badoo
Архитектура хранения и отдачи фотографий в Badoo Архитектура хранения и отдачи фотографий в Badoo
Архитектура хранения и отдачи фотографий в Badoo
Badoo Development
 
5 способов деплоя PHP-кода в условиях хайлоада
5 способов деплоя PHP-кода в условиях хайлоада5 способов деплоя PHP-кода в условиях хайлоада
5 способов деплоя PHP-кода в условиях хайлоада
Badoo Development
 
Git хуки на страже качества кода
Git хуки на страже качества кодаGit хуки на страже качества кода
Git хуки на страже качества кода
Badoo Development
 
Versioning strategy for a complex internal API
Versioning strategy for a complex internal APIVersioning strategy for a complex internal API
Versioning strategy for a complex internal API
Badoo Development
 
Как мы готовим MySQL
Как мы готовим MySQLКак мы готовим MySQL
Как мы готовим MySQL
Badoo Development
 
Методология: БЭМ, Модули, Отношения
Методология: БЭМ, Модули, ОтношенияМетодология: БЭМ, Модули, Отношения
Методология: БЭМ, Модули, Отношения
Badoo Development
 

Recently uploaded (20)

ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
ThousandEyes Partner Innovation Updates for May 2025

Profiling and optimizing go programs

  • 1. 5/12/2016 Profiling and optimizing Go programs http://localhost:3999/gomeetup.slide#1 1/84 Profiling and optimizing Go programs 14 July 2016 Marko Kevac Software Engineer, Badoo
  • 2. Introduction
  • 3. What is profiling and optimization?
  • 4. Profiling on Linux
  • 5. Profiling on OSX
      OSX profiling was fixed in El Capitan.
      Earlier versions need a binary patch.
      godoc.org/rsc.io/pprof_mac_fix (https://godoc.org/rsc.io/pprof_mac_fix)
  • 6. CPU
      github.com/gperftools/gperftools (https://github.com/gperftools/gperftools)
  • 7. CPU
      pprof is a sampling profiler.
      The profilers in Go can be started in different ways, but the work of each one splits into a collection phase and a visualization phase.
      Example.
  • 8. Example
      package perftest

      import (
          "regexp"
          "strings"
          "testing"
      )

      var haystack = `Lorem ipsum dolor sit amet ... auctor ... elit ...`

      func BenchmarkSubstring(b *testing.B) {
          for i := 0; i < b.N; i++ {
              strings.Contains(haystack, "auctor")
          }
      }

      func BenchmarkRegex(b *testing.B) {
          for i := 0; i < b.N; i++ {
              regexp.MatchString("auctor", haystack)
          }
      }
  • 9. Benchmark
      $ go test -bench=.
      testing: warning: no tests to run
      BenchmarkSubstring-8    10000000     194 ns/op
      BenchmarkRegex-8          200000    7516 ns/op
      PASS
      ok      github.com/mkevac/perftest00    3.789s
  • 10. Profiling
      $ GOGC=off go test -bench=BenchmarkRegex -cpuprofile cpu.out
      testing: warning: no tests to run
      BenchmarkRegex-8          200000    6773 ns/op
      PASS
      ok      github.com/mkevac/perftest00    1.491s

      GOGC=off turns off the garbage collector.
      Turning off the GC can be beneficial for short programs.
      When started with -cpuprofile, go test leaves the test binary in the working directory.
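The same switch can be flipped from inside a program. A minimal sketch, assuming the standard runtime/debug package; the helper name disableGC is made up for illustration and is not part of the talk:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// disableGC (hypothetical helper) switches the garbage collector off,
// which is what GOGC=off does, and returns the previous percentage.
func disableGC() int {
	return debug.SetGCPercent(-1)
}

func main() {
	old := disableGC()
	defer debug.SetGCPercent(old) // restore the earlier setting on exit
	fmt.Println("GC disabled; previous GOGC value:", old)
}
```

As with the environment variable, this only makes sense for short-lived programs whose heap fits in memory without collection.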
  • 11. Visualization
      Linux
      $ go tool pprof perftest00.test cpu.out
      (pprof) web

      OSX
      $ open https://www.xquartz.org
      $ ssh -Y server
      $ go tool pprof perftest00.test cpu.out
      (pprof) web

      Other
      $ go tool pprof -svg ./perftest00.test ./cpu.out > cpu.svg
      $ scp ...
      $ open cpu.svg
  • 12. Visualization
  • 14. Visualization
  • 15. Fix
      package perftest

      import (
          "regexp"
          "strings"
          "testing"
      )

      var haystack = `Lorem ipsum dolor sit amet ... auctor ... elit ...`
      var pattern = regexp.MustCompile("auctor")

      func BenchmarkSubstring(b *testing.B) {
          for i := 0; i < b.N; i++ {
              strings.Contains(haystack, "auctor")
          }
      }

      func BenchmarkRegex(b *testing.B) {
          for i := 0; i < b.N; i++ {
              pattern.MatchString(haystack)
          }
      }
  • 16. Benchmark
      $ go test -bench=.
      testing: warning: no tests to run
      BenchmarkSubstring-8    10000000     170 ns/op
      BenchmarkRegex-8         5000000     297 ns/op
      PASS
      ok      github.com/mkevac/perftest01    3.685s

      What about the call graph?
  • 17. Visualization
      Regex compilation no longer shows up at all.
  • 18. Ways to start the CPU profiler
      1. go test -cpuprofile=cpu.out
      2. pprof.StartCPUProfile() and pprof.StopCPUProfile(), or Dave Cheney's great package github.com/pkg/profile (https://github.com/pkg/profile)
      3. import _ "net/http/pprof"
      Example
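The second option in the list above can be sketched as follows. This is an illustration under assumptions, not code from the talk: profileCPUToFile and spin are hypothetical names, while pprof.StartCPUProfile and pprof.StopCPUProfile come from the standard runtime/pprof package.

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// profileCPUToFile (hypothetical helper) runs work with the CPU profiler
// enabled and writes the collected profile to path.
func profileCPUToFile(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	// Deferred calls run last-in first-out, so the profile is flushed
	// by StopCPUProfile before the file is closed.
	defer pprof.StopCPUProfile()

	work()
	return nil
}

// spin is a stand-in workload that keeps one core busy for a moment.
func spin() {
	var acc uint64
	for i := 0; i < 200000000; i++ {
		acc += uint64(i)
	}
	_ = acc
}

func main() {
	if err := profileCPUToFile("cpu.out", spin); err != nil {
		log.Fatal(err)
	}
}
```

The resulting cpu.out can be opened with go tool pprof in the same way as the file produced by go test -cpuprofile.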
  • 19. Example
      package main

      import (
          "net/http"
          _ "net/http/pprof"
      )

      func cpuhogger() {
          var acc uint64
          for {
              acc += 1
              if acc&1 == 0 {
                  acc <<= 1
              }
          }
      }

      func main() {
          go http.ListenAndServe("0.0.0.0:8080", nil)
          cpuhogger()
      }
  • 20. Visualization
      $ go tool pprof http://localhost:8080/debug/pprof/profile?seconds=5
      (pprof) web
      (pprof) top
      4.99s of 4.99s total (  100%)
            flat  flat%   sum%        cum   cum%
           4.99s   100%   100%      4.99s   100%  main.cpuhogger
               0     0%   100%      4.99s   100%  runtime.goexit
               0     0%   100%      4.99s   100%  runtime.main
      (pprof) list cpuhogger
      Total: 4.99s
      No source information for main.cpuhogger

      No disassembly? No source code? We need the binary.
  • 21. Visualization
      $ go tool pprof pproftest http://localhost:8080/debug/pprof/profile?seconds=5
      (pprof) list cpuhogger
      Total: 4.97s
      ROUTINE ======================== main.cpuhogger in /home/marko/goprojects/src/github.com/mkevac/pproft
           4.97s      4.97s (flat, cum)   100% of Total
               .          .      6:)
               .          .      7:
               .          .      8:func cpuhogger() {
               .          .      9:    var acc uint64
               .          .     10:    for {
           2.29s      2.29s     11:        acc += 1
           1.14s      1.14s     12:        if acc&1 == 0 {
           1.54s      1.54s     13:            acc <<= 1
               .          .     14:        }
               .          .     15:    }
               .          .     16:}
               .          .     17:
               .          .     18:func main() {
  • 22. Visualization
      (pprof) disasm cpuhogger
      Total: 4.97s
      ROUTINE ======================== main.cpuhogger
           4.97s      4.97s (flat, cum)   100% of Total
               .          .     401000: XORL AX, AX
           1.75s      1.75s     401002: INCQ AX
           1.14s      1.14s     401005: TESTQ $0x1, AX
               .          .     40100b: JNE 0x401002
           1.54s      1.54s     40100d: SHLQ $0x1, AX
           540ms      540ms     401010: JMP 0x401002
               .          .     401012: INT $0x3

      Why? Let's dig deeper.
  • 23. Why?
      $ curl http://localhost:8080/debug/pprof/profile?seconds=5 -o /tmp/cpu.log
      $ strings /tmp/cpu.log | grep cpuhogger

      /debug/pprof/symbol is used for acquiring symbols
      the binary is needed for disassembly
      the binary and the source code are needed for source listings

      Currently there is no way to specify the path to the source code (like the "dir" command in gdb) :-(
      The binary that you give to pprof and the binary that is running must be the same!
      Not deep enough?
  • 24. How does pprof work?
      1. Current desktop and server OSes implement preemptive scheduling (https://en.wikipedia.org/wiki/Preemption_(computing)), also called preemptive multitasking (as opposed to cooperative multitasking).
      2. Hardware sends a signal to the OS, and the OS runs the scheduler, which can preempt the running process and put another process in its place.
      3. pprof works in a similar fashion.
      4. man setitimer (http://man7.org/linux/man-pages/man2/setitimer.2.html) and SIGPROF
      5. Go sets a handler for SIGPROF which captures and saves the stack trace of the goroutine/thread that was interrupted.
      6. A separate goroutine hands this data to the user.
      A bug in SIGPROF signal delivery (http://research.swtch.com/macpprof) was the reason why profiling on OSX before El Capitan did not work.
  • 25. How does pprof work?
      Cons
      1. Signals are not cheap. Do not expect more than 500 signals per second. The default frequency in the Go runtime is 100 Hz.
      2. In non-standard builds (-buildmode=c-archive or -buildmode=c-shared) the profiler does not work by default.
      3. A user-space process does not have access to kernel stack traces.
      Pros
      The Go runtime has all the knowledge about its own internals.
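The sampling frequency determines how much CPU time a single sample stands for. The sketch below only does that arithmetic; samplesToCPUTime is a hypothetical helper, and the sample count of 499 is an assumed figure for a 5-second profile of one busy thread at 100 Hz.

```go
package main

import (
	"fmt"
	"time"
)

// samplesToCPUTime (hypothetical helper) converts a number of profiler
// samples into the CPU time they represent at a sampling rate of hz.
func samplesToCPUTime(samples, hz int) time.Duration {
	return time.Duration(samples) * time.Second / time.Duration(hz)
}

func main() {
	fmt.Println(samplesToCPUTime(1, 100))   // prints 10ms: one sample at 100 Hz
	fmt.Println(samplesToCPUTime(499, 100)) // prints 4.99s
}
```

At 100 Hz anything much shorter than 10 ms can only show up statistically, which is why short benchmarks need many iterations before the profile is meaningful.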
  • 26. Linux system profilers
      package main

      import (
          "regexp"
          "strings"
      )

      var haystack = `Lorem ipsum dolor sit amet ... auctor ... elit ...`

      func UsingSubstring() bool {
          found := strings.Contains(haystack, "auctor")
          return found
      }

      func UsingRegex() bool {
          found, _ := regexp.MatchString("auctor", haystack)
          return found
      }

      func main() {
          go func() {
              for {
                  UsingSubstring()
              }
          }()
          for {
              UsingRegex()
          }
      }
  • 27. Systemtap
      Systemtap script -> C code -> Kernel module
      The stap utility does all of this for you, including loading and unloading the kernel module.
  • 28. Systemtap
      Getting the probe list:
      $ stap -l 'process("systemtap").function("main.*")'
      process("systemtap").function("[email protected]:16")
      process("systemtap").function("[email protected]:11")
      process("systemtap").function("[email protected]:32")
      process("systemtap").function("[email protected]:22")
      process("systemtap").function("[email protected]:21")

      Getting the probe list with function arguments:
      $ stap -L 'process("systemtap").function("runtime.mallocgc")'
      process("systemtap").function("runtime.mallocgc@src/runtime/malloc.go:553") $shouldhelpgc:bool $noscan:bool $scanSize:uintptr $dataSize:uintptr $x:void* $s:struct runtime.mspan* runtime.g* $size:uintptr $typ:runtime._type* $needzero:bool $~r3:void*

      Systemtap does not understand where Go keeps the return value, so we have to fetch it manually:
      printf("%d\n", user_int64(register("rsp") + 8))
  • 29. Systemtap
      global etime
      global intervals

      probe $1.call {
          etime = gettimeofday_ns()
      }

      probe $1.return {
          intervals <<< (gettimeofday_ns() - etime) / 1000
      }

      probe end {
          printf("Duration min:%dus avg:%dus max:%dus count:%d\n",
                 @min(intervals), @avg(intervals), @max(intervals),
                 @count(intervals))
          printf("Duration (us):\n")
          print(@hist_log(intervals));
          printf("\n")
      }
  • 30. Systemtap
      $ sudo stap main.stap 'process("systemtap").function("main.UsingSubstring")'
      ^CDuration min:0us avg:1us max:586us count:1628362
      Duration (us):
      value |-------------------------------------------------- count
          0 |  10
          1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  1443040
          2 |@@@@@  173089
          4 |  6982
          8 |  4321
         16 |  631
         32 |  197
         64 |  74
        128 |  13
        256 |  4
        512 |  1
       1024 |  0
       2048 |  0
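The histogram above groups durations into power-of-two buckets. As a reading aid, here is a small Go sketch of that bucketing for non-negative values; log2Bucket is a hypothetical helper written for this note and is not how systemtap implements @hist_log.

```go
package main

import (
	"fmt"
	"math/bits"
)

// log2Bucket (hypothetical helper) returns the lower bound of the
// power-of-two bucket that a non-negative value falls into:
// 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 4, and so on.
func log2Bucket(v uint64) uint64 {
	if v == 0 {
		return 0
	}
	return uint64(1) << uint(bits.Len64(v)-1)
}

func main() {
	counts := map[uint64]int{}
	for _, us := range []uint64{0, 1, 1, 2, 3, 5, 586} {
		counts[log2Bucket(us)]++
	}
	fmt.Println(counts)
}
```

Under this scheme the reported maximum of 586us lands in the bucket labelled 512, which matches the single entry in that row.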
  • 31. Systemtap
      $ ./systemtap
      runtime: unexpected return pc for main.UsingSubstring called from 0x7fffffffe000
      fatal error: unknown caller pc

      runtime stack:
      runtime.throw(0x494e40, 0x11)
          /home/marko/go/src/runtime/panic.go:566 +0x8b
      runtime.gentraceback(0xffffffffffffffff, 0xc8200337a8, 0x0, 0xc820001d40, 0x0, 0x0, 0x7fffffff, 0x7fff
          /home/marko/go/src/runtime/traceback.go:311 +0x138c
      runtime.scanstack(0xc820001d40)
          /home/marko/go/src/runtime/mgcmark.go:755 +0x249
      runtime.scang(0xc820001d40)
          /home/marko/go/src/runtime/proc.go:836 +0x132
      runtime.markroot.func1()
          /home/marko/go/src/runtime/mgcmark.go:234 +0x55
      runtime.systemstack(0x4e4f00)
          /home/marko/go/src/runtime/asm_amd64.s:298 +0x79
      runtime.mstart()
          /home/marko/go/src/runtime/proc.go:1087
  • 32. Systemtap
      The crash happens when Go's garbage collector walks the call stack.
      It is probably caused by the trampoline that systemtap puts into our code to handle its probes.
      goo.gl/N8XH3p (https://goo.gl/N8XH3p)
      No fix yet.
      But Go is not alone. There are problems with the uretprobes trampoline in C++ too (https://sourceware.org/bugzilla/show_bug.cgi?id=12275) (2010-)
  • 33. Systemtap
      package main

      import (
          "bytes"
          "fmt"
          "math/rand"
          "time"
      )

      func ToString(number int) string {
          return fmt.Sprintf("%d", number)
      }

      func main() {
          r := rand.New(rand.NewSource(time.Now().UnixNano()))
          var buf bytes.Buffer
          for i := 0; i < 1000; i++ {
              value := r.Int() % 1000
              value = value - 500
              buf.WriteString(ToString(value))
          }
      }
  • 34. Systemtap
      global intervals

      probe process("systemtap02").function("main.ToString").call {
          intervals <<< $number
      }

      probe end {
          printf("Variables min:%dus avg:%dus max:%dus count:%d\n",
                 @min(intervals), @avg(intervals), @max(intervals),
                 @count(intervals))
          printf("Variables:\n")
          print(@hist_log(intervals));
          printf("\n")
      }
  • 35. Systemtap
      Variables min:-499us avg:8us max:497us count:1000
      Variables:
      value |-------------------------------------------------- count
       -256 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  249
       -128 |@@@@@@@@@@@@@@@@@@@@  121
        -64 |@@@@@@@@@@  60
        -32 |@@@@@@  36
        -16 |@@  12
         -8 |@  8
         -4 |  5
         -2 |  3
         -1 |  2
          0 |  2
          1 |  2
          2 |  3
          4 |@  7
          8 |  4
         16 |@@@  20
         32 |@@@@@  33
         64 |@@@@@@@  44
        128 |@@@@@@@@@@@@@@@@@@  110
        256 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  279
  • 36. perf and perf_events
      $ sudo perf top -p $(pidof systemtap)
  • 38. perf and perf_events
  • 40. Brendan Gregg
      Flame Graphs
      www.brendangregg.com/flamegraphs.html (http://www.brendangregg.com/flamegraphs.html)
      Systems Performance: Enterprise and the Cloud
      goo.gl/556Hs2 (http://goo.gl/556Hs2)

      $ sudo perf record -F 99 -g -p $(pidof systemtap) -- sleep 10
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.149 MB perf.data (1719 samples) ]
      $ sudo perf script | ~/tmp/FlameGraph/stackcollapse-perf.pl > out.perf-folded
      $ ~/tmp/FlameGraph/flamegraph.pl out.perf-folded > perf-kernel.svg
  • 41. Brendan Gregg Flame Graphs
      Kernel stack traces!
  • 42. Memory
      What if we were in the C/C++ world? Valgrind! Massif!

      #include <stdlib.h>
      #include <unistd.h>
      #include <string.h>

      int main() {
          const size_t MB = 1024 * 1024;
          const unsigned count = 20;
          char **buf = calloc(count, sizeof(*buf));
          for (unsigned i = 0; i < count; i++) {
              buf[i] = calloc(1, MB);
              memset(buf[i], 0xFF, MB);
              sleep(1);
          }
          for (unsigned i = 0; i < count; i++) {
              free(buf[i]);
              sleep(1);
          }
          free(buf);
      }
  • 43. Valgrind and Massif
      (Massif ASCII graph of heap usage over time: usage climbs in steps to a peak labelled 26.20 on the y-axis, then falls back in steps; the x-axis runs from 0 to 39.13 s.)
  • 44. Valgrind and Massif
      Valgrind redefines all memory allocation functions (malloc, calloc, new, free, etc.).
      Go does not use them. Go has its own memory allocator, which uses mmap or sbrk.
  • 45. Memory
      Valgrind can catch mmap/sbrk, but there is no point.
      All other memory profiling tools work in the same fashion.
      We can theoretically use perf/systemtap.
      Or we can use Go's rich internal tools.
  • 46. Memory
      Go can collect information about allocations at a configurable sampling rate (by default, about once per 512 KiB allocated).
      pprof can visualize it.
      As with CPU profiling, there are three ways to collect the data.
      Let's use net/http/pprof this time.
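For comparison with net/http/pprof, a heap profile can also be written to a file by hand. This is a sketch under assumptions: writeHeapProfile and kept are hypothetical names, and lowering runtime.MemProfileRate to 1 is done only so that this tiny demo records every allocation.

```go
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

var kept [][]byte // package-level so the allocations stay live

// writeHeapProfile (hypothetical helper) dumps the sampled heap profile to path.
func writeHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC() // run a collection so the profile reflects up-to-date statistics
	return pprof.WriteHeapProfile(f)
}

func main() {
	// Demo-only assumption: sample every allocation instead of roughly
	// one per 512 KiB. The rate should be set as early as possible.
	runtime.MemProfileRate = 1
	for i := 0; i < 1000; i++ {
		kept = append(kept, make([]byte, 1024))
	}
	if err := writeHeapProfile("heap.out"); err != nil {
		log.Fatal(err)
	}
}
```

The file can then be inspected with go tool pprof, using the same -inuse_space and -alloc_space options as a profile fetched over HTTP.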
  • 47. Example
      package main

      import (
          "net/http"
          _ "net/http/pprof"
          "time"
      )

      func allocAndKeep() {
          var b [][]byte
          for {
              b = append(b, make([]byte, 1024))
              time.Sleep(time.Millisecond)
          }
      }

      func allocAndLeave() {
          var b [][]byte
          for {
              b = append(b, make([]byte, 1024))
              if len(b) == 20 {
                  b = nil
              }
              time.Sleep(time.Millisecond)
          }
      }

      func main() {
          go allocAndKeep()
          go allocAndLeave()
          http.ListenAndServe("0.0.0.0:8080", nil)
      }
  • 48. go tool pprof
      alloc_space - allocated bytes
      alloc_objects - number of allocated objects
      inuse_space - allocated bytes that are in use (live)
      inuse_objects - number of allocated objects that are in use (live)
      We expect inuse to show only allocAndKeep() and alloc to show both functions.
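The difference between "allocated" and "in use" can also be seen process-wide in runtime.MemStats, where TotalAlloc only ever grows and HeapAlloc drops once garbage is collected. The sketch below is an analogy rather than a description of how pprof computes its numbers; churn and allocatedDuring are hypothetical helpers.

```go
package main

import (
	"fmt"
	"runtime"
)

var sink []byte // package-level so each buffer really is heap-allocated

// churn allocates n buffers of size bytes and keeps none of them.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
	sink = nil
}

// allocatedDuring (hypothetical helper) reports how many bytes were
// allocated, live or not, while work was running.
func allocatedDuring(work func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	work()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	total := allocatedDuring(func() { churn(20, 1<<20) })

	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	fmt.Printf("allocated during churn: %d MiB\n", total>>20)
	fmt.Printf("live heap after GC:     %d KiB\n", m.HeapAlloc>>10)
}
```

The first number corresponds to the alloc_* view (everything ever allocated), the second to the inuse_* view (what is still live).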
  • 49. go tool pprof
      $ go tool pprof -inuse_space memtest http://localhost:8080/debug/pprof/heap
      Fetching profile from http://localhost:8080/debug/pprof/heap
      Saved profile in /home/marko/pprof/pprof.memtest.localhost:8080.inuse_objects.inuse_space.005.pb.gz
      Entering interactive mode (type "help" for commands)
      (pprof) top
      15.36MB of 15.36MB total (  100%)
      Dropped 2 nodes (cum <= 0.08MB)
            flat  flat%   sum%        cum   cum%
         15.36MB   100%   100%    15.36MB   100%  main.allocAndKeep
               0     0%   100%    15.36MB   100%  runtime.goexit

      $ go tool pprof -alloc_space memtest http://localhost:8080/debug/pprof/heap
      Fetching profile from http://localhost:8080/debug/pprof/heap
      Saved profile in /home/marko/pprof/pprof.memtest.localhost:8080.alloc_objects.alloc_space.008.pb.gz
      Entering interactive mode (type "help" for commands)
      (pprof) top
      54.49MB of 54.49MB total (  100%)
      Dropped 8 nodes (cum <= 0.27MB)
            flat  flat%   sum%        cum   cum%
         27.97MB 51.33% 51.33%    29.47MB 54.08%  main.allocAndKeep
         23.52MB 43.17% 94.49%    25.02MB 45.92%  main.allocAndLeave
             3MB  5.51%   100%        3MB  5.51%  time.Sleep
               0     0%   100%    54.49MB   100%  runtime.goexit
  • 50. Sleep?
      The result looks as predicted. But what is going on with Sleep?
      (pprof) list time.Sleep
      Total: 54.49MB
      ROUTINE ======================== time.Sleep in /home/marko/go/src/runtime/time.go
             3MB        3MB (flat, cum)  5.51% of Total
               .          .     48:func timeSleep(ns int64) {
               .          .     49:    if ns <= 0 {
               .          .     50:        return
               .          .     51:    }
               .          .     52:
             3MB        3MB     53:    t := new(timer)
               .          .     54:    t.when = nanotime() + ns
               .          .     55:    t.f = goroutineReady
               .          .     56:    t.arg = getg()
               .          .     57:    lock(&timers.lock)
               .          .     58:    addtimerLocked(t)
  • 51. Implicit allocations
      package printtest

      import (
          "bytes"
          "fmt"
          "testing"
      )

      func BenchmarkPrint(b *testing.B) {
          var buf bytes.Buffer
          var s string = "test string"
          for i := 0; i < b.N; i++ {
              buf.Reset()
              fmt.Fprintf(&buf, "string is: %s", s)
          }
      }

      Benchmark?
  • 52. Benchmark
      $ go test -bench=. -benchmem
      testing: warning: no tests to run
      BenchmarkPrint-8    10000000    128 ns/op    16 B/op    1 allocs/op
      PASS
      ok      github.com/mkevac/converttest    1.420s
  • 53. Profiling
      $ go test -bench=. -memprofile=mem.out -memprofilerate=1

      -memprofilerate sets the profiling rate. 1 means that every allocation is recorded.

      $ go tool pprof -alloc_space converttest.test mem.out
      (pprof) top
      15.41MB of 15.48MB total (99.59%)
      Dropped 73 nodes (cum <= 0.08MB)
            flat  flat%   sum%        cum   cum%
         15.41MB 99.59% 99.59%    15.43MB 99.67%  github.com/mkevac/converttest.BenchmarkPrint
               0     0% 99.59%    15.47MB 99.93%  runtime.goexit
               0     0% 99.59%    15.42MB 99.66%  testing.(*B).launch
               0     0% 99.59%    15.43MB 99.67%  testing.(*B).runN
  • 54. Profiling
      (pprof) list BenchmarkPrint
      Total: 15.48MB
      ROUTINE ======================== github.com/mkevac/converttest.BenchmarkPrint in /home/marko/goproject
         15.41MB    15.43MB (flat, cum) 99.67% of Total
               .          .      9:func BenchmarkPrint(b *testing.B) {
               .          .     10:    var buf bytes.Buffer
               .          .     11:    var s string = "test string"
               .          .     12:    for i := 0; i < b.N; i++ {
               .          .     13:        buf.Reset()
         15.41MB    15.43MB     14:        fmt.Fprintf(&buf, "string is: %s", s)
               .          .     15:    }
               .          .     16:}
  • 55. Profiling
      (pprof) list fmt.Fprintf
      Total: 15.48MB
      ROUTINE ======================== fmt.Fprintf in /home/marko/go/src/fmt/print.go
               0    12.02kB (flat, cum) 0.076% of Total
               .          .    175:// These routines end in 'f' and take a format string.
               .          .    176:
               .          .    177:// Fprintf formats according to a format specifier and writes to w.
               .          .    178:// It returns the number of bytes written and any write error encountered.
               .          .    179:func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error)
               .    11.55kB    180:    p := newPrinter()
               .       480B    181:    p.doPrintf(format, a)
               .          .    182:    n, err = w.Write(p.buf)
               .          .    183:    p.free()
               .          .    184:    return
               .          .    185:}
               .          .    186:
  • 56. Disassembly
               .          .     466edb: CALL bytes.(*Buffer).Reset(SB)
               .          .     466ee0: LEAQ 0x98b6b(IP), AX
               .          .     466ee7: MOVQ AX, 0x70(SP)
               .          .     466eec: MOVQ $0xb, 0x78(SP)
               .          .     466ef5: MOVQ $0x0, 0x60(SP)
               .          .     466efe: MOVQ $0x0, 0x68(SP)
               .          .     466f07: LEAQ 0x70d92(IP), AX
               .          .     466f0e: MOVQ AX, 0(SP)
               .          .     466f12: LEAQ 0x70(SP), AX
               .          .     466f17: MOVQ AX, 0x8(SP)
               .          .     466f1c: MOVQ $0x0, 0x10(SP)
         15.41MB    15.41MB     466f25: CALL runtime.convT2E(SB)
               .          .     466f2a: MOVQ 0x18(SP), AX
               .          .     466f2f: MOVQ 0x20(SP), CX
               .          .     466f34: MOVQ AX, 0x60(SP)
               .          .     466f39: MOVQ CX, 0x68(SP)
               .          .     466f3e: LEAQ 0x10b35b(IP), AX
               .          .     466f45: MOVQ AX, 0(SP)
               .          .     466f49: MOVQ 0x58(SP), AX
               .          .     466f4e: MOVQ AX, 0x8(SP)
               .          .     466f53: LEAQ 0x99046(IP), CX
               .          .     466f5a: MOVQ CX, 0x10(SP)
  • 57. (disassembly, continued)
               .          .     466f5f: MOVQ $0xd, 0x18(SP)
               .          .     466f68: LEAQ 0x60(SP), CX
               .          .     466f6d: MOVQ CX, 0x20(SP)
               .          .     466f72: MOVQ $0x1, 0x28(SP)
               .          .     466f7b: MOVQ $0x1, 0x30(SP)
               .    12.02kB     466f84: CALL fmt.Fprintf(SB)
  • 58. Fprintf
      func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error)
      interface{} looks like the same thing as void*... but it is not.
  • 59. Go internal types
      string, chan, func, slice, interface, etc.
  • 60. Empty interface
      var s string = "marko"
      var a interface{} = &s
      no allocation

      var s string = "marko"
      var a interface{} = s
      16-byte allocation
  • 61. Empty interface
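The claim on this slide can be checked with testing.AllocsPerRun from the standard library. The sketch below is an illustration, not code from the talk: the function names and the package-level sink are hypothetical, and the string is built at run time so that the compiler cannot treat it as a constant. The exact count for the value case depends on the Go version, so measure rather than assume.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

var (
	s    = strings.ToLower("MARKO") // built at run time, not a compile-time constant
	sink interface{}                // package-level so the stored value escapes
)

// allocsStoringPointer measures storing a *string in an empty interface.
func allocsStoringPointer() float64 {
	return testing.AllocsPerRun(1000, func() { sink = &s })
}

// allocsStoringValue measures storing the string itself, which requires
// the two-word string header to be boxed on the heap.
func allocsStoringValue() float64 {
	return testing.AllocsPerRun(1000, func() { sink = s })
}

func main() {
	fmt.Println("interface{} = &s:", allocsStoringPointer(), "allocs per run")
	fmt.Println("interface{} = s: ", allocsStoringValue(), "allocs per run")
}
```

A pointer already fits into the interface's data word, while a string value is two words (pointer and length, 16 bytes on 64-bit platforms) and has to be copied somewhere the interface can point to.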
  • 62. Fix
      package main

      import (
          "bytes"
          "testing"
      )

      func BenchmarkPrint(b *testing.B) {
          var buf bytes.Buffer
          var s string = "test string"
          for i := 0; i < b.N; i++ {
              buf.Reset()
              buf.WriteString("string is: ")
              buf.WriteString(s)
          }
      }

      Benchmark?
  • 63. Benchmark
      $ go test -bench=BenchmarkPrint -benchmem
      testing: warning: no tests to run
      BenchmarkPrint-8    50000000    27.5 ns/op    0 B/op    0 allocs/op
      PASS
      ok      github.com/mkevac/converttest01    1.413s

      0 allocations and more than 4x faster (128 ns/op down to 27.5 ns/op)
  • 64. Implicit allocation
      A string and a char * are pretty much the same thing in C. But not in Go.

      package main

      import (
          "fmt"
      )

      func main() {
          var array = []byte{'m', 'a', 'r', 'k', 'o'}
          if string(array) == "marko" {
              fmt.Println("equal")
          }
      }
  • 65. Implicit allocation
      Always check your assumptions.
      The Go runtime, the Go compiler and the Go tools get better every day.
      An optimization you read about in 2010 may no longer be needed. Or it may even be harmful.
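One way to check such an assumption is to count allocations directly. The sketch below uses testing.AllocsPerRun; the function and variable names are hypothetical. It contrasts a conversion that is used only for a comparison with a conversion whose result is kept.

```go
package main

import (
	"fmt"
	"testing"
)

var (
	array = []byte{'m', 'a', 'r', 'k', 'o'}
	equal bool
	kept  string
)

// allocsComparing converts the bytes only to compare them with a constant.
func allocsComparing() float64 {
	return testing.AllocsPerRun(1000, func() { equal = string(array) == "marko" })
}

// allocsKeeping stores the converted string in a package-level variable,
// so a real copy of the bytes has to be made.
func allocsKeeping() float64 {
	return testing.AllocsPerRun(1000, func() { kept = string(array) })
}

func main() {
	fmt.Println("string(array) == \"marko\":", allocsComparing(), "allocs per run")
	fmt.Println("kept = string(array):     ", allocsKeeping(), "allocs per run")
}
```

With current compilers the comparison does not allocate, because the temporary string never escapes; keeping the result does.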
  • 66. Example (again)
      package main

      import (
          "bytes"
          "testing"
          "unsafe"
      )

      var s string

      func BenchmarkConvert(b *testing.B) {
          var buf bytes.Buffer
          var array = []byte{'m', 'a', 'r', 'k', 'o', 0}
          for i := 0; i < b.N; i++ {
              buf.Reset()
              s = string(array)
              buf.WriteString(s)
          }
      }
  • 67. Benchmark
      $ go test -bench=. -benchmem
      testing: warning: no tests to run
      BenchmarkConvert-8    30000000    42.1 ns/op    8 B/op    1 allocs/op
  • 68. Fix
      // Needs "reflect" added to the import list shown earlier.
      // The returned string shares memory with b: no copy is made.
      func BytesToString(b []byte) string {
          bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
          sh := reflect.StringHeader{bh.Data, bh.Len}
          return *(*string)(unsafe.Pointer(&sh))
      }

      func BenchmarkNoConvert(b *testing.B) {
          var buf bytes.Buffer
          var array = []byte{'m', 'a', 'r', 'k', 'o', 0}
          for i := 0; i < b.N; i++ {
              buf.Reset()
              s = BytesToString(array)
              buf.WriteString(s)
          }
      }
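The zero-copy conversion comes with a catch: the string and the byte slice share memory, so the "immutable" string changes when the bytes do. The sketch below demonstrates this with a shorter, commonly used form of the same cast; bytesToString and aliasDemo are hypothetical names. Since this talk was given, reflect.SliceHeader and reflect.StringHeader have been deprecated, and Go 1.20 added unsafe.String and unsafe.SliceData for this purpose.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets the slice header as a string header.
// The result aliases b: no bytes are copied.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// aliasDemo reports whether the string matched "marko" before the byte
// slice was modified and "Marko" afterwards.
func aliasDemo() (before, after bool) {
	b := []byte("marko")
	s := bytesToString(b)
	before = s == "marko"
	b[0] = 'M' // also changes what s reads as
	after = s == "Marko"
	return before, after
}

func main() {
	before, after := aliasDemo()
	fmt.Println("matched \"marko\" before the write:", before)
	fmt.Println("matched \"Marko\" after the write: ", after)
}
```

The trick is therefore only safe when the bytes are never modified for as long as the string is in use.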
  • 69. Benchmark
      $ go test -bench=. -benchmem
      testing: warning: no tests to run
      BenchmarkConvert-8       30000000    44.5 ns/op    8 B/op    1 allocs/op
      BenchmarkNoConvert-8    100000000    19.2 ns/op    0 B/op    0 allocs/op
      PASS
      ok      github.com/mkevac/bytetostring    3.332s
  • 70. Tracing
      The Go runtime can record almost everything it does:
      scheduling, channel operations, locks, thread creation, ...
      The full list is in runtime/trace.go.
      For visualization, go tool trace uses the same JS package that Chrome uses for page-load visualization.
      Example.
  • 71. debugcharts
      github.com/mkevac/debugcharts (http://github.com/mkevac/debugcharts)
      Calls runtime.ReadMemStats() once a second.
  • 72. Example
      package main

      import (
          "net/http"
          _ "net/http/pprof"
          "time"

          _ "github.com/mkevac/debugcharts"
      )

      func CPUHogger() {
          var acc uint64
          t := time.Tick(2 * time.Second)
          for {
              select {
              case <-t:
                  time.Sleep(50 * time.Millisecond)
              default:
                  acc++
              }
          }
      }

      func main() {
          go CPUHogger()
          go CPUHogger()
          http.ListenAndServe("0.0.0.0:8181", nil)
      }
  • 73. Tracing
      $ curl http://localhost:8181/debug/pprof/trace?seconds=10 -o trace.out

      Sometimes 1-3 seconds of trace is all that can be visualized.

      $ go tool trace -http "0.0.0.0:8080" ./tracetest trace.out
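A trace can also be collected from inside the program instead of over HTTP. A minimal sketch, assuming the standard runtime/trace package; traceToFile is a hypothetical helper and the workload is just a goroutine handing off a signal.

```go
package main

import (
	"log"
	"os"
	"runtime/trace"
	"time"
)

// traceToFile (hypothetical helper) records an execution trace of work
// and writes it to path.
func traceToFile(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	if err := trace.Start(f); err != nil {
		return err
	}
	defer trace.Stop() // runs before f.Close, so the trace is flushed first

	work()
	return nil
}

func main() {
	err := traceToFile("trace.out", func() {
		done := make(chan struct{})
		go func() {
			time.Sleep(10 * time.Millisecond)
			close(done)
		}()
		<-done // shows up in the trace as a blocked goroutine being woken
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

The resulting trace.out is opened with go tool trace in the same way as a file downloaded from /debug/pprof/trace.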
  • 74. Tracing
  • 75. Tracing
  • 76. Tracing
  • 77. proc stop and proc start
  • 78. runtime.ReadMemStats()
      // ReadMemStats populates m with memory allocator statistics.
      func ReadMemStats(m *MemStats) {
          stopTheWorld("read mem stats")

          systemstack(func() {
              readmemstats_m(m)
          })

          startTheWorld()
      }
      Production? No! Every call stops the world.
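A rough way to see what each call costs is to time a batch of them. The sketch below measures wall-clock time per call, which is only a proxy for the stop-the-world pause; the helper name `readMemStatsCost` and the iteration count are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// readMemStatsCost (hypothetical helper) returns the total wall-clock time
// spent in n consecutive calls to runtime.ReadMemStats.
func readMemStatsCost(n int) time.Duration {
	var ms runtime.MemStats
	start := time.Now()
	for i := 0; i < n; i++ {
		runtime.ReadMemStats(&ms)
	}
	return time.Since(start)
}

func main() {
	const n = 1000
	total := readMemStatsCost(n)
	fmt.Printf("%d calls took %v (about %v per call)\n", n, total, total/n)
}
```

The numbers depend on the Go version, heap size and number of goroutines, so they are best measured inside the real service rather than in a toy program.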
  • 79. Conclusion There is so much more.
  • 80. Conclusion CPU profiler; memory profiler; tracing of all allocations; escape analysis; lock/contention profiler; scheduler tracing; tracing; GC tracing; real-time memory statistics; system profilers like perf and systemtap. But no tool will replace a deep understanding of how your program works from start to finish.
  • 81. I hope that today's crash course was helpful.
  • 82. Stay curious
  • 83. Thank you Marko Kevac Software Engineer, Badoo [email protected] @mkevac (http://twitter.com/mkevac)