SlideShare a Scribd company logo
If you don’t get this ref...shame on you
Jarred Nicholls
  @jarrednicholls
jarred@webkit.org
Work @ Sencha
Web Platform Team

   Doing webkitty things...
WebKit Committer
Co-Author
W3C Web Cryptography
        API
JavaScript on the GPU
What I’ll blabber about today
Why JavaScript on the GPU
Running JavaScript on the GPU

What’s to come...
Why JavaScript on the GPU?
Why JavaScript on the GPU?

        Better question:
         Why a GPU?
Why JavaScript on the GPU?

        Better question:
         Why a GPU?

        A: They’re fast!
         (well, at certain things...)
GPUs are fast b/c...
Totally different paradigm from CPUs
Data parallelism vs. Task parallelism
Stream processing vs. Sequential processing
    GPUs can divide-and-conquer
Hardware capable of a large number of “threads”
    e.g. ATI Radeon HD 6770m:
    480 stream processing units == 480 cores
Typically very high memory bandwidth
Many, many GigaFLOPs
GPUs don’t solve all problems
Not all tasks can be accelerated by GPUs
Tasks must be parallelizable, i.e.:
    Side effect free
    Homogeneous and/or streamable
Overall tasks will become limited by Amdahl’s Law
JavaScript on the GPU
Let’s find out...
Experiment
Code Name “LateralJS”
LateralJS

Our Mission
To make JavaScript a first-class citizen on all GPUs
and take advantage of hardware accelerated
operations & data parallelization.
Our Options
          OpenCL                 Nvidia CUDA
AMD, Nvidia, Intel, etc.   Nvidia only
A shitty version of C99    C++ (C for CUDA)
No dynamic memory          Dynamic memory
No recursion               Recursion
No function pointers       Function pointers
Terrible tooling           Great dev. tooling
Immature (arguably)        More mature (arguably)
Our Options
          OpenCL                 Nvidia CUDA
AMD, Nvidia, Intel, etc.   Nvidia only
A shitty version of C99    C++ (C for CUDA)
No dynamic memory          Dynamic memory
No recursion               Recursion
No function pointers       Function pointers
Terrible tooling           Great dev. tooling
Immature (arguably)        More mature (arguably)
Why not a Static Compiler?
We want full JavaScript support
    Object / prototype
    Closures
    Recursion
    Functions as objects
    Variable typing
Type Inference limitations
Reasonably limited to size and complexity of “kernel-
esque” functions
Not nearly insane enough
JavaScript on the GPU
Why an Interpreter?
We want it all baby - full JavaScript support!
Most insane approach
Challenging to make it good, but holds a lot of promise
OpenCL Headaches
JavaScript on the GPU
Oh the agony...
Multiple memory spaces - pointer hell
No recursion - all inlined functions
No standard libc libraries
No dynamic memory
No standard data structures - apart from vector ops
Buggy ass AMD/Nvidia compilers
JavaScript on the GPU
Multiple Memory Spaces
In the order of fastest to slowest:
         space                 description
                   very fast
         private   stream processor cache (~64KB)
                   scoped to a single work item
                   fast
          local    ~= L1 cache on CPUs (~64KB)
                   scoped to a single work group
                   slow, by orders of magnitude
         global    ~= system memory over slow bus
        constant   available to all work groups/items
                   all the VRAM on the card (MBs)
Memory Space Pointer Hell
global uchar* gptr = 0x1000;
local uchar* lptr = (local uchar*) gptr; // FAIL!
uchar* pptr = (uchar*) gptr; // FAIL! private is implicit


                               0x1000




             global             local            private




            0x1000 points to something different
              depending on the address space!
Memory Space Pointer Hell
           Pointers must always be fully qualified
                Macros to help ease the pain

#define   GPTR(TYPE)   global TYPE*
#define   CPTR(TYPE)   constant TYPE*
#define   LPTR(TYPE)   local TYPE*
#define   PPTR(TYPE)   private TYPE*
No Recursion!?!?!?
  No call stack
  All functions are inlined to the kernel function


uint factorial(uint n) {
    if (n <= 1)
         return 1;
    else
         return n * factorial(n - 1); // compile-time error
}
No standard libc libraries
memcpy?
strcpy?
strcmp?
etc...
No standard libc libraries
                              Implement our own
#define MEMCPY(NAME, DEST_AS, SRC_AS) 
    DEST_AS void* NAME(DEST_AS void*, SRC_AS const void*, uint); 
    DEST_AS void* NAME(DEST_AS void* dest, SRC_AS const void* src, uint size) { 
        DEST_AS uchar* cDest = (DEST_AS uchar*)dest; 
        SRC_AS const uchar* cSrc = (SRC_AS const uchar*)src; 
        for (uint i = 0; i < size; i++) 
            cDest[i] = cSrc[i]; 
        return (DEST_AS void*)cDest; 
    }
PTR_MACRO_DEST_SRC(MEMCPY, memcpy)


                                        Produces
             memcpy_g            memcpy_gc           memcpy_lc           memcpy_pc
             memcpy_l            memcpy_gl           memcpy_lg           memcpy_pg
             memcpy_p            memcpy_gp           memcpy_lp           memcpy_pl
No dynamic memory
No malloc()
No free()
What to do...
Yes! dynamic memory
  Create a large buffer of global memory - our “heap”
  Implement our own malloc() and free()
  Create a handle structure - “virtual memory”
  P(T, hnd) macro to get the current pointer address

GPTR(handle) hnd = malloc(sizeof(uint));
GPTR(uint) ptr = P(uint, hnd);
*ptr = 0xdeadbeef;
free(hnd);
JavaScript on the GPU
Ok, we get the point...
        FYL!
High-level Architecture
       V8                 Data Heap



Esprima Parser            Stack-based
                          Interpreter

                          Host
                          Host
     Host                  GPUs
Data Serializer &
  Marshaller           Garbage Collector



  Device Mgr
High-level Architecture
                    eval(code);
       V8                               Data Heap
                    Build JSON AST

Esprima Parser                          Stack-based
                                        Interpreter

                                        Host
                                        Host
     Host                                GPUs
Data Serializer &
  Marshaller                         Garbage Collector



  Device Mgr
High-level Architecture
                    eval(code);
       V8                                  Data Heap
                    Build JSON AST

Esprima Parser                             Stack-based
                                           Interpreter
                       Serialize AST
                                           Host
                                           Host
     Host           JSON => C Structs       GPUs
Data Serializer &
  Marshaller                            Garbage Collector



  Device Mgr
High-level Architecture
                         eval(code);
       V8                                         Data Heap
                          Build JSON AST

Esprima Parser                                    Stack-based
                                                  Interpreter
                            Serialize AST
                                                  Host
                                                  Host
     Host                JSON => C Structs         GPUs
Data Serializer &
  Marshaller                                   Garbage Collector
                    Ship to GPU to Interpret

  Device Mgr
High-level Architecture
                         eval(code);
       V8                                         Data Heap
                          Build JSON AST

Esprima Parser                                    Stack-based
                                                  Interpreter
                            Serialize AST
                                                  Host
                                                  Host
     Host                JSON => C Structs         GPUs
Data Serializer &
  Marshaller                                   Garbage Collector
                    Ship to GPU to Interpret

  Device Mgr
                          Fetch Result
AST Generation
AST Generation

                                     JSON AST
JavaScript Source
                                    (v8::Object)




                                                   Lateral AST
                    Esprima in V8
                                                   (C structs)
Embed esprima.js

              Resource Generator

$ resgen esprima.js resgen_esprima_js.c
Embed esprima.js

                       resgen_esprima_js.c
const unsigned char resgen_esprima_js[]   = {
    0x2f, 0x2a, 0x0a, 0x20, 0x20, 0x43,   0x6f, 0x70, 0x79, 0x72,
    0x69, 0x67, 0x68, 0x74, 0x20, 0x28,   0x43, 0x29, 0x20, 0x32,
    ...
    0x20, 0x3a, 0x20, 0x2a, 0x2f, 0x0a,   0x0a, 0
};
Embed esprima.js
                          ASTGenerator.cpp
extern const char resgen_esprima_js;

void ASTGenerator::init()
{
    HandleScope scope;
    s_context = Context::New();
    s_context->Enter();
    Handle<Script> script = Script::Compile(String::New(&resgen_esprima_js));
    script->Run();
    s_context->Exit();
    s_initialized = true;
}
Build JSON AST

                    e.g.
ASTGenerator::esprimaParse(
    "var xyz = new Array(10);"
);
Build JSON AST
Handle<Object> ASTGenerator::esprimaParse(const char* javascript)
{
    if (!s_initialized)
        init();


    HandleScope scope;
    s_context->Enter();
    Handle<Object> global = s_context->Global();
    Handle<Object> esprima = Handle<Object>::Cast(global->Get(String::New("esprima")));
    Handle<Function> esprimaParse = Handle<Function>::Cast(esprima-
>Get(String::New("parse")));
    Handle<String> code = String::New(javascript);
    Handle<Object> ast = Handle<Object>::Cast(esprimaParse->Call(esprima, 1,
(Handle<Value>*)&code));


    s_context->Exit();
    return scope.Close(ast);
}
Build JSON AST
{
    "type": "VariableDeclaration",
    "declarations": [
        {
            "type": "VariableDeclarator",
            "id": {
                "type": "Identifier",
                "name": "xyz"
            },
            "init": {
                "type": "NewExpression",
                "callee": {
                    "type": "Identifier",
                    "name": "Array"
                },
                "arguments": [
                    {
                        "type": "Literal",
                        "value": 10
                    }
                ]
            }
        }
    ],
    "kind": "var"
}
Lateral AST structs
typedef struct ast_type_st {         #ifdef __OPENCL_VERSION__
    CL(uint) id;                     #define CL(TYPE) TYPE
    CL(uint) size;                   #else
} ast_type;                          #define CL(TYPE) cl_##TYPE
                                     #endif
typedef struct ast_program_st {
    ast_type type;
    CL(uint) body;
    CL(uint) numBody;
                                      Structs shared between
} ast_program;                           Host and OpenCL

typedef struct ast_identifier_st {
    ast_type type;
    CL(uint) name;
} ast_identifier;
Lateral AST structs

                            v8::Object => ast_type
                                  expanded
ast_type* vd1_1_init_id = (ast_type*)astCreateIdentifier("Array");
ast_type* vd1_1_init_args[1];
vd1_1_init_args[0] = (ast_type*)astCreateNumberLiteral(10);
ast_type* vd1_1_init = (ast_type*)astCreateNewExpression(vd1_1_init_id, vd1_1_init_args, 1);
free(vd1_1_init_id);
for (int i = 0; i < 1; i++)
    free(vd1_1_init_args[i]);
ast_type* vd1_1_id = (ast_type*)astCreateIdentifier("xyz");
ast_type* vd1_decls[1];
vd1_decls[0] = (ast_type*)astCreateVariableDeclarator(vd1_1_id, vd1_1_init);
free(vd1_1_id);
free(vd1_1_init);
ast_type* vd1 = (ast_type*)astCreateVariableDeclaration(vd1_decls, 1, "var");
for (int i = 0; i < 1; i++)
    free(vd1_decls[i]);
Lateral AST structs
                          astCreateIdentifier
ast_identifier* astCreateIdentifier(const char* str) {
    CL(uint) size = sizeof(ast_identifier) + rnd(strlen(str) + 1, 4);
    ast_identifier* ast_id = (ast_identifier*)malloc(size);

    // copy the string
    strcpy((char*)(ast_id + 1), str);

    // fill the struct
    ast_id->type.id = AST_IDENTIFIER;
    ast_id->type.size = size;
    ast_id->name = sizeof(ast_identifier); // offset

    return ast_id;
}
Lateral AST structs
         astCreateIdentifier(“xyz”)
offset      field              value
  0        type.id    AST_IDENTIFIER (0x01)
  4       type.size             16
  8        name             12 (offset)
 12        str[0]               ‘x’
 13        str[1]               ‘y’
 14        str[2]               ‘z’
 15        str[3]              ‘0’
Lateral AST structs
                                  astCreateNewExpression
ast_expression_new* astCreateNewExpression(ast_type* callee, ast_type** arguments, int numArgs) {
    CL(uint) size = sizeof(ast_expression_new) + callee->size;
    for (int i = 0; i < numArgs; i++)
        size += arguments[i]->size;

    ast_expression_new* ast_new = (ast_expression_new*)malloc(size);
    ast_new->type.id = AST_NEW_EXPR;
    ast_new->type.size = size;

    CL(uint) offset = sizeof(ast_expression_new);
    char* dest = (char*)ast_new;

    // copy callee
    memcpy(dest + offset, callee, callee->size);
    ast_new->callee = offset;
    offset += callee->size;

    // copy arguments
    if (numArgs) {
        ast_new->arguments = offset;
        for (int i = 0; i < numArgs; i++) {
            ast_type* arg = arguments[i];
            memcpy(dest + offset, arg, arg->size);
            offset += arg->size;
        }
    } else
        ast_new->arguments = 0;
    ast_new->numArguments = numArgs;

    return ast_new;
}
Lateral AST structs
                 new Array(10)
offset       field                 value
  0         type.id     AST_NEW_EXPR (0x308)
  4        type.size               52
  8         callee             20 (offset)
 12       arguments            40 (offset)
 16      numArguments              1
 20       callee node    ast_identifier (“Array”)
          arguments
 40                      ast_literal_number (10)
             node
Lateral AST structs
Shared across the Host and the OpenCL runtime
    Host writes, Lateral reads
Constructed on Host as contiguous blobs
    Easy to send to GPU: memcpy(gpu, ast, ast->size);
    Fast to send to GPU, single buffer write
    Simple to traverse w/ pointer arithmetic
Stack-based
 Interpreter
Building Blocks
                     JS Type Structs


AST Traverse Stack                       Lateral State


 Call/Exec Stack        Heap           Symbol/Ref Table


  Return Stack                           Scope Stack




AST Traverse Loop                      Interpret Loop
Kernels
#include "state.h"
#include "jsvm/asttraverse.h"
#include "jsvm/interpreter.h"

// Setup VM structures
kernel void lateral_init(GPTR(uchar) lateral_heap) {
    LATERAL_STATE_INIT
}

// Interpret the AST
kernel void lateral(GPTR(uchar) lateral_heap, GPTR(ast_type) lateral_ast) {
    LATERAL_STATE

    ast_push(lateral_ast);
    while (!Q_EMPTY(lateral_state->ast_stack, ast_q) || !Q_EMPTY(lateral_state->call_stack,
call_q)) {
        while (!Q_EMPTY(lateral_state->ast_stack, ast_q))
            traverse();
        if (!Q_EMPTY(lateral_state->call_stack, call_q))
            interpret();
    }
}
Let’s interpret...



 var x = 1 + 2;
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST   Call   Return
    "declarations": [
        {
            "type": "VariableDeclarator",
            "id": {
                "type": "Identifier",
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",             AST      Call   Return
    "declarations": [
        {
            "type": "VariableDeclarator",     VarDecl
            "id": {
                "type": "Identifier",
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                     "type": "Literal",
                     "value": 1
                },
                "right": {
                     "type": "Literal",
                     "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",             AST      Call   Return
    "declarations": [
        {
            "type": "VariableDeclarator",     VarDtor
            "id": {
                "type": "Identifier",
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST       Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",     Ident    VarDtor
            "id": {
                "type": "Identifier",          Binary
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",             AST       Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",      Ident    VarDtor
            "id": {
                "type": "Identifier",          Literal    Binary
            },
                "name": "x"
                                              Literal
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",             AST        Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",      Ident    VarDtor
            "id": {
                "type": "Identifier",          Literal    Binary
            },
                "name": "x"
                                                         Literal
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST       Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",     Ident   VarDtor
            "id": {
                "type": "Identifier",                   Binary
            },
                "name": "x"
                                                       Literal
            "init": {
                "type": "BinaryExpression",
                                                       Literal
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST     Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",           VarDtor
            "id": {
                "type": "Identifier",                 Binary
            },
                "name": "x"
                                                     Literal
            "init": {
                "type": "BinaryExpression",
                                                     Literal
                "operator": "+",
                "left": {
                                                      Ident
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST     Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",           VarDtor     “x”
            "id": {
                "type": "Identifier",                 Binary
            },
                "name": "x"
                                                     Literal
            "init": {
                "type": "BinaryExpression",
                                                     Literal
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST     Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",           VarDtor     “x”
            "id": {
                "type": "Identifier",                 Binary      1
            },
                "name": "x"
                                                     Literal
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST    Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",           VarDtor    “x”
            "id": {
                "type": "Identifier",                 Binary     1
            },
                "name": "x"
                                                                2
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST    Call     Return
    "declarations": [
        {
            "type": "VariableDeclarator",           VarDtor    “x”
            "id": {
                "type": "Identifier",                            3
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
var x = 1 + 2;
{
    "type": "VariableDeclaration",            AST   Call   Return
    "declarations": [
        {
            "type": "VariableDeclarator",
            "id": {
                "type": "Identifier",
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
Benchmark
Benchmark

                 Small loop of FLOPs
var input = new Array(10);
for (var i = 0; i < input.length; i++) {
    input[i] = Math.pow((i + 1) / 1.23, 3);
}
Execution Time
               Lateral
   GPU CL                CPU CL                      V8
 ATI Radeon 6770m   Intel Core i7 4x2.4Ghz   Intel Core i7 4x2.4Ghz




116.571533ms        0.226007ms               0.090664ms
Execution Time
               Lateral
   GPU CL                CPU CL                      V8
 ATI Radeon 6770m   Intel Core i7 4x2.4Ghz   Intel Core i7 4x2.4Ghz




116.571533ms        0.226007ms               0.090664ms
JavaScript on the GPU
What went wrong?
Everything
Stack-based AST Interpreter, no optimizations
Heavy global memory access, no optimizations
No data or task parallelism
Stack-based Interpreter
Slow as molasses
Memory hog Eclipse style
Heavy memory access
     “var x = 1 + 2;” == 30 stack hits alone!
     Too much dynamic allocation
No inline optimizations, just following the yellow brick AST
Straight up lazy

Replace with something better!
Bytecode compiler on Host
Bytecode register-based interpreter on Device
JavaScript on the GPU
Too much global access
   Everything is dynamically allocated to global memory
   Register based interpreter & bytecode compiler can
   make better use of local and private memory
// 11.1207 seconds
size_t tid = get_global_id(0);
c[tid] = a[tid];
while(b[tid] > 0) { // touch global memory on each loop
  b[tid]--; // touch global memory on each loop
  c[tid]++; // touch global memory on each loop       Optimizing memory access
}

// 0.0445558 seconds!! HOLY SHIT!
                                                      yields crazy results
size_t tid = get_global_id(0);
int tmp = a[tid]; // temp private variable
for(int i=b[tid]; i > 0; i--) tmp++; // touch private variables on each loop
c[tid] = tmp; // touch global memory one time
No data or task parallelism
  Everything being interpreted in a single “thread”
  We have hundreds of cores available to us!
  Build in heuristics
         Identify side-effect free statements
         Break into parallel tasks - very magical

                                                    input[0] = Math.pow((0 + 1) / 1.23, 3);
var input = new Array(10);
for (var i = 0; i < input.length; i++) {            input[1] = Math.pow((1 + 1) / 1.23, 3);

}
    input[i] = Math.pow((i + 1) / 1.23, 3);
                                                                        ...
                                                    input[9] = Math.pow((9 + 1) / 1.23, 3);
What’s in store
Acceptable performance on all CL devices
V8/Node extension to launch Lateral tasks
High-level API to perform map-reduce, etc.
Lateral-cluster...mmmmm
Thanks!

  Jarred Nicholls
  @jarrednicholls
jarred@webkit.org
Ad

More Related Content

What's hot (20)

DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...
DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...
DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...
smalltown
 
用十分鐘搞懂《離散數學》
用十分鐘搞懂《離散數學》用十分鐘搞懂《離散數學》
用十分鐘搞懂《離散數學》
鍾誠 陳鍾誠
 
технологія пошуку інформації засобами мережі інтернет
технологія пошуку інформації засобами мережі інтернеттехнологія пошуку інформації засобами мережі інтернет
технологія пошуку інформації засобами мережі інтернет
Марина Белоцерковец
 
Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...
Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...
Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...
sveta7940
 
Docker Introduction
Docker IntroductionDocker Introduction
Docker Introduction
Robert Reiz
 
8 Алгоритми з розгалуженнями для опрацювання величин
8 Алгоритми з розгалуженнями для опрацювання величин8 Алгоритми з розгалуженнями для опрацювання величин
8 Алгоритми з розгалуженнями для опрацювання величин
Сергій Каляфіцький
 
Enriching Word Vectors with Subword Information
Enriching Word Vectors with Subword InformationEnriching Word Vectors with Subword Information
Enriching Word Vectors with Subword Information
harmonylab
 
Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...
Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...
Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...
Edureka!
 
Testing Ansible with Jenkins and Docker
Testing Ansible with Jenkins and DockerTesting Ansible with Jenkins and Docker
Testing Ansible with Jenkins and Docker
Dennis Rowe
 
SCS DevSecOps Seminar - State of DevSecOps
SCS DevSecOps Seminar - State of DevSecOpsSCS DevSecOps Seminar - State of DevSecOps
SCS DevSecOps Seminar - State of DevSecOps
Stefan Streichsbier
 
Problem solving
Problem solvingProblem solving
Problem solving
ho94949
 
3 клас урок 24 що таке команди, алгоритми та їх виконавці
3 клас урок 24 що таке команди, алгоритми та їх виконавці3 клас урок 24 що таке команди, алгоритми та їх виконавці
3 клас урок 24 що таке команди, алгоритми та їх виконавці
Сокальська ЗШ І-ІІІ ступенів №2
 
Docker introduction (1)
Docker introduction (1)Docker introduction (1)
Docker introduction (1)
Gourav Varma
 
RSpec & TDD Tutorial
RSpec & TDD TutorialRSpec & TDD Tutorial
RSpec & TDD Tutorial
Wen-Tien Chang
 
ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4
ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4
ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4
Johnny Sung
 
[PreSchool-1] 프로그래밍 '개념' 맛보기
[PreSchool-1] 프로그래밍 '개념' 맛보기[PreSchool-1] 프로그래밍 '개념' 맛보기
[PreSchool-1] 프로그래밍 '개념' 맛보기
Young-Ho Cho
 
Docker by Example - Basics
Docker by Example - Basics Docker by Example - Basics
Docker by Example - Basics
CodeOps Technologies LLP
 
Календарне планування 8 клас - Інформатика - Python
Календарне планування 8 клас - Інформатика - PythonКалендарне планування 8 клас - Інформатика - Python
Календарне планування 8 клас - Інформатика - Python
VsimPPT
 
What is Docker?
What is Docker?What is Docker?
What is Docker?
Shubhrank Rastogi
 
середовище Lazarus
середовище Lazarusсередовище Lazarus
середовище Lazarus
Татьяна Ляш
 
DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...
DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...
DevOpsDays Taipei 2024 - Evolution of DevOps: Lessons Learned from a Growing ...
smalltown
 
用十分鐘搞懂《離散數學》
用十分鐘搞懂《離散數學》用十分鐘搞懂《離散數學》
用十分鐘搞懂《離散數學》
鍾誠 陳鍾誠
 
технологія пошуку інформації засобами мережі інтернет
технологія пошуку інформації засобами мережі інтернеттехнологія пошуку інформації засобами мережі інтернет
технологія пошуку інформації засобами мережі інтернет
Марина Белоцерковец
 
Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...
Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...
Презентація:Практична робота №2. Історія засобів опрацювання інформаційних об...
sveta7940
 
Docker Introduction
Docker IntroductionDocker Introduction
Docker Introduction
Robert Reiz
 
8 Алгоритми з розгалуженнями для опрацювання величин
8 Алгоритми з розгалуженнями для опрацювання величин8 Алгоритми з розгалуженнями для опрацювання величин
8 Алгоритми з розгалуженнями для опрацювання величин
Сергій Каляфіцький
 
Enriching Word Vectors with Subword Information
Enriching Word Vectors with Subword InformationEnriching Word Vectors with Subword Information
Enriching Word Vectors with Subword Information
harmonylab
 
Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...
Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...
Docker Explained | What Is A Docker Container? | Docker Simplified | Docker T...
Edureka!
 
Testing Ansible with Jenkins and Docker
Testing Ansible with Jenkins and DockerTesting Ansible with Jenkins and Docker
Testing Ansible with Jenkins and Docker
Dennis Rowe
 
SCS DevSecOps Seminar - State of DevSecOps
SCS DevSecOps Seminar - State of DevSecOpsSCS DevSecOps Seminar - State of DevSecOps
SCS DevSecOps Seminar - State of DevSecOps
Stefan Streichsbier
 
Problem solving
Problem solvingProblem solving
Problem solving
ho94949
 
Docker introduction (1)
Docker introduction (1)Docker introduction (1)
Docker introduction (1)
Gourav Varma
 
ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4
ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4
ArgoCD 的雷 碰過的人就知道 @TSMC IT Community Meetup #4
Johnny Sung
 
[PreSchool-1] 프로그래밍 '개념' 맛보기
[PreSchool-1] 프로그래밍 '개념' 맛보기[PreSchool-1] 프로그래밍 '개념' 맛보기
[PreSchool-1] 프로그래밍 '개념' 맛보기
Young-Ho Cho
 
Календарне планування 8 клас - Інформатика - Python
Календарне планування 8 клас - Інформатика - PythonКалендарне планування 8 клас - Інформатика - Python
Календарне планування 8 клас - Інформатика - Python
VsimPPT
 

Viewers also liked (20)

AST - the only true tool for building JavaScript
AST - the only true tool for building JavaScriptAST - the only true tool for building JavaScript
AST - the only true tool for building JavaScript
Ingvar Stepanyan
 
Don't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax TreesDon't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax Trees
Jamund Ferguson
 
Hardware Acceleration on Mobile, Ariya Hidayat & Jarred Nicholls
Hardware Acceleration on Mobile, Ariya Hidayat & Jarred NichollsHardware Acceleration on Mobile, Ariya Hidayat & Jarred Nicholls
Hardware Acceleration on Mobile, Ariya Hidayat & Jarred Nicholls
Sencha
 
レインボーテーブルを使ったハッシュの復号とSalt
レインボーテーブルを使ったハッシュの復号とSaltレインボーテーブルを使ったハッシュの復号とSalt
レインボーテーブルを使ったハッシュの復号とSalt
Ryo Maruyama
 
(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine
(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine
(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine
ZongXian Shen
 
Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)
Yevgeniy Brikman
 
Graphics Processing Unit - GPU
Graphics Processing Unit - GPUGraphics Processing Unit - GPU
Graphics Processing Unit - GPU
Chetan Gole
 
Iocp 기본 구조 이해
Iocp 기본 구조 이해Iocp 기본 구조 이해
Iocp 기본 구조 이해
Nam Hyeonuk
 
Graphics processing unit ppt
Graphics processing unit pptGraphics processing unit ppt
Graphics processing unit ppt
Sandeep Singh
 
게임서버프로그래밍 #1 - IOCP
게임서버프로그래밍 #1 - IOCP게임서버프로그래밍 #1 - IOCP
게임서버프로그래밍 #1 - IOCP
Seungmo Koo
 
Chainerで学ぶdeep learning
Chainerで学ぶdeep learningChainerで学ぶdeep learning
Chainerで学ぶdeep learning
Retrieva inc.
 
헤테로지니어스 컴퓨팅 : CPU 에서 GPU 로 옮겨가기
헤테로지니어스 컴퓨팅 :  CPU 에서 GPU 로 옮겨가기헤테로지니어스 컴퓨팅 :  CPU 에서 GPU 로 옮겨가기
헤테로지니어스 컴퓨팅 : CPU 에서 GPU 로 옮겨가기
zupet
 
使用Javascript及HTML5打造協同運作系統
使用Javascript及HTML5打造協同運作系統使用Javascript及HTML5打造協同運作系統
使用Javascript及HTML5打造協同運作系統
Hsu Ping Feng
 
入門Gulp - 前端自動化開發工具
入門Gulp - 前端自動化開發工具入門Gulp - 前端自動化開發工具
入門Gulp - 前端自動化開發工具
Anna Su
 
webpack 入門
webpack 入門webpack 入門
webpack 入門
Anna Su
 
前端界流傳的神奇招式
前端界流傳的神奇招式前端界流傳的神奇招式
前端界流傳的神奇招式
Anna Su
 
Railway Oriented Programming
Railway Oriented ProgrammingRailway Oriented Programming
Railway Oriented Programming
Scott Wlaschin
 
俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation
俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation
俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation
Yusuke HIDESHIMA
 
FMK2015: The Power of JavaScript by Marcel Moré
FMK2015: The Power of JavaScript by Marcel MoréFMK2015: The Power of JavaScript by Marcel Moré
FMK2015: The Power of JavaScript by Marcel Moré
Verein FM Konferenz
 
FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...
FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...
FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...
Verein FM Konferenz
 
AST - the only true tool for building JavaScript
AST - the only true tool for building JavaScriptAST - the only true tool for building JavaScript
AST - the only true tool for building JavaScript
Ingvar Stepanyan
 
Don't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax TreesDon't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax Trees
Jamund Ferguson
 
Hardware Acceleration on Mobile, Ariya Hidayat & Jarred Nicholls
Hardware Acceleration on Mobile, Ariya Hidayat & Jarred NichollsHardware Acceleration on Mobile, Ariya Hidayat & Jarred Nicholls
Hardware Acceleration on Mobile, Ariya Hidayat & Jarred Nicholls
Sencha
 
レインボーテーブルを使ったハッシュの復号とSalt
レインボーテーブルを使ったハッシュの復号とSaltレインボーテーブルを使ったハッシュの復号とSalt
レインボーテーブルを使ったハッシュの復号とSalt
Ryo Maruyama
 
(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine
(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine
(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine
ZongXian Shen
 
Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)
Yevgeniy Brikman
 
Graphics Processing Unit - GPU
Graphics Processing Unit - GPUGraphics Processing Unit - GPU
Graphics Processing Unit - GPU
Chetan Gole
 
Iocp 기본 구조 이해
Iocp 기본 구조 이해Iocp 기본 구조 이해
Iocp 기본 구조 이해
Nam Hyeonuk
 
Graphics processing unit ppt
Graphics processing unit pptGraphics processing unit ppt
Graphics processing unit ppt
Sandeep Singh
 
게임서버프로그래밍 #1 - IOCP
게임서버프로그래밍 #1 - IOCP게임서버프로그래밍 #1 - IOCP
게임서버프로그래밍 #1 - IOCP
Seungmo Koo
 
Chainerで学ぶdeep learning
Chainerで学ぶdeep learningChainerで学ぶdeep learning
Chainerで学ぶdeep learning
Retrieva inc.
 
헤테로지니어스 컴퓨팅 : CPU 에서 GPU 로 옮겨가기
헤테로지니어스 컴퓨팅 :  CPU 에서 GPU 로 옮겨가기헤테로지니어스 컴퓨팅 :  CPU 에서 GPU 로 옮겨가기
헤테로지니어스 컴퓨팅 : CPU 에서 GPU 로 옮겨가기
zupet
 
使用Javascript及HTML5打造協同運作系統
使用Javascript及HTML5打造協同運作系統使用Javascript及HTML5打造協同運作系統
使用Javascript及HTML5打造協同運作系統
Hsu Ping Feng
 
入門Gulp - 前端自動化開發工具
入門Gulp - 前端自動化開發工具入門Gulp - 前端自動化開發工具
入門Gulp - 前端自動化開發工具
Anna Su
 
webpack 入門
webpack 入門webpack 入門
webpack 入門
Anna Su
 
前端界流傳的神奇招式
前端界流傳的神奇招式前端界流傳的神奇招式
前端界流傳的神奇招式
Anna Su
 
Railway Oriented Programming
Railway Oriented ProgrammingRailway Oriented Programming
Railway Oriented Programming
Scott Wlaschin
 
俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation
俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation
俺のtensorが全然flowしないのでみんなchainer使おう by DEEPstation
Yusuke HIDESHIMA
 
FMK2015: The Power of JavaScript by Marcel Moré
FMK2015: The Power of JavaScript by Marcel MoréFMK2015: The Power of JavaScript by Marcel Moré
FMK2015: The Power of JavaScript by Marcel Moré
Verein FM Konferenz
 
FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...
FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...
FMK2015: Entwicklung von modernen Benutzeroberflächen mit FileMaker Pro by Ad...
Verein FM Konferenz
 
Ad

Similar to JavaScript on the GPU (20)

Data Structures Handling Trillions of Daily Streaming Events by Evan Chan
Data Structures Handling Trillions of Daily Streaming Events by Evan ChanData Structures Handling Trillions of Daily Streaming Events by Evan Chan
Data Structures Handling Trillions of Daily Streaming Events by Evan Chan
ScyllaDB
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
DataWorks Summit/Hadoop Summit
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
lcplcp1
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
J On The Beach
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Databricks
 
10 things i wish i'd known before using spark in production
10 things i wish i'd known before using spark in production10 things i wish i'd known before using spark in production
10 things i wish i'd known before using spark in production
Paris Data Engineers !
 
Evolution of Spark APIs
Evolution of Spark APIsEvolution of Spark APIs
Evolution of Spark APIs
Máté Szalay-Bekő
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読み
NVIDIA Japan
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
Evan Chan
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Databricks
 
12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine
ChengHui Weng
 
Gpu Join Presentation
Gpu Join PresentationGpu Join Presentation
Gpu Join Presentation
Suman Karumuri
 
Nibin - Reverse Engineering for exploit writers - ClubHack2008
Nibin - Reverse Engineering for exploit writers - ClubHack2008Nibin - Reverse Engineering for exploit writers - ClubHack2008
Nibin - Reverse Engineering for exploit writers - ClubHack2008
ClubHack
 
Reverse Engineering for exploit writers
Reverse Engineering for exploit writersReverse Engineering for exploit writers
Reverse Engineering for exploit writers
amiable_indian
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
Fastly
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
Kohei KaiGai
 
XS Japan 2008 Xen Mgmt English
XS Japan 2008 Xen Mgmt EnglishXS Japan 2008 Xen Mgmt English
XS Japan 2008 Xen Mgmt English
The Linux Foundation
 
import rdma: zero-copy networking with RDMA and Python
import rdma: zero-copy networking with RDMA and Pythonimport rdma: zero-copy networking with RDMA and Python
import rdma: zero-copy networking with RDMA and Python
groveronline
 
Adventures in Thread-per-Core Async with Redpanda and Seastar
Adventures in Thread-per-Core Async with Redpanda and SeastarAdventures in Thread-per-Core Async with Redpanda and Seastar
Adventures in Thread-per-Core Async with Redpanda and Seastar
ScyllaDB
 
DotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NETDotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NET
Maarten Balliauw
 
Data Structures Handling Trillions of Daily Streaming Events by Evan Chan
Data Structures Handling Trillions of Daily Streaming Events by Evan ChanData Structures Handling Trillions of Daily Streaming Events by Evan Chan
Data Structures Handling Trillions of Daily Streaming Events by Evan Chan
ScyllaDB
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
DataWorks Summit/Hadoop Summit
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
lcplcp1
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
J On The Beach
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Databricks
 
10 things i wish i'd known before using spark in production
10 things i wish i'd known before using spark in production10 things i wish i'd known before using spark in production
10 things i wish i'd known before using spark in production
Paris Data Engineers !
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読み
NVIDIA Japan
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
Evan Chan
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Databricks
 
12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine
ChengHui Weng
 
Nibin - Reverse Engineering for exploit writers - ClubHack2008
Nibin - Reverse Engineering for exploit writers - ClubHack2008Nibin - Reverse Engineering for exploit writers - ClubHack2008
Nibin - Reverse Engineering for exploit writers - ClubHack2008
ClubHack
 
Reverse Engineering for exploit writers
Reverse Engineering for exploit writersReverse Engineering for exploit writers
Reverse Engineering for exploit writers
amiable_indian
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
Fastly
 
20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
Kohei KaiGai
 
import rdma: zero-copy networking with RDMA and Python
import rdma: zero-copy networking with RDMA and Pythonimport rdma: zero-copy networking with RDMA and Python
import rdma: zero-copy networking with RDMA and Python
groveronline
 
Adventures in Thread-per-Core Async with Redpanda and Seastar
Adventures in Thread-per-Core Async with Redpanda and SeastarAdventures in Thread-per-Core Async with Redpanda and Seastar
Adventures in Thread-per-Core Async with Redpanda and Seastar
ScyllaDB
 
DotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NETDotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NET
Maarten Balliauw
 
Ad

Recently uploaded (20)

Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdfAre Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Telecoms Supermarket
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
MINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PRMINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PR
MIND CTI
 
Mastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdfMastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdf
Spiral Mantra
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdfAre Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Telecoms Supermarket
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
MINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PRMINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PR
MIND CTI
 
Mastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdfMastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdf
Spiral Mantra
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 

JavaScript on the GPU

  • 1. If you don’t get this ref...shame on you
  • 3. Work @ Sencha Web Platform Team Doing webkitty things...
  • 7. What I’ll blabber about today Why JavaScript on the GPU Running JavaScript on the GPU What’s to come...
  • 8. Why JavaScript on the GPU?
  • 9. Why JavaScript on the GPU? Better question: Why a GPU?
  • 10. Why JavaScript on the GPU? Better question: Why a GPU? A: They’re fast! (well, at certain things...)
  • 11. GPUs are fast b/c... Totally different paradigm from CPUs Data parallelism vs. Task parallelism Stream processing vs. Sequential processing GPUs can divide-and-conquer Hardware capable of a large number of “threads” e.g. ATI Radeon HD 6770m: 480 stream processing units == 480 cores Typically very high memory bandwidth Many, many GigaFLOPs
  • 12. GPUs don’t solve all problems Not all tasks can be accelerated by GPUs Tasks must be parallelizable, i.e.: Side effect free Homogeneous and/or streamable Overall tasks will become limited by Amdahl’s Law
  • 16. LateralJS Our Mission To make JavaScript a first-class citizen on all GPUs and take advantage of hardware accelerated operations & data parallelization.
  • 17. Our Options OpenCL Nvidia CUDA AMD, Nvidia, Intel, etc. Nvidia only A shitty version of C99 C++ (C for CUDA) No dynamic memory Dynamic memory No recursion Recursion No function pointers Function pointers Terrible tooling Great dev. tooling Immature (arguably) More mature (arguably)
  • 18. Our Options OpenCL Nvidia CUDA AMD, Nvidia, Intel, etc. Nvidia only A shitty version of C99 C++ (C for CUDA) No dynamic memory Dynamic memory No recursion Recursion No function pointers Function pointers Terrible tooling Great dev. tooling Immature (arguably) More mature (arguably)
  • 19. Why not a Static Compiler? We want full JavaScript support Object / prototype Closures Recursion Functions as objects Variable typing Type Inference limitations Reasonably limited to size and complexity of “kernel- esque” functions Not nearly insane enough
  • 21. Why an Interpreter? We want it all baby - full JavaScript support! Most insane approach Challenging to make it good, but holds a lot of promise
  • 24. Oh the agony... Multiple memory spaces - pointer hell No recursion - all inlined functions No standard libc libraries No dynamic memory No standard data structures - apart from vector ops Buggy ass AMD/Nvidia compilers
  • 26. Multiple Memory Spaces In the order of fastest to slowest: space description very fast private stream processor cache (~64KB) scoped to a single work item fast local ~= L1 cache on CPUs (~64KB) scoped to a single work group slow, by orders of magnitude global ~= system memory over slow bus constant available to all work groups/items all the VRAM on the card (MBs)
  • 27. Memory Space Pointer Hell global uchar* gptr = 0x1000; local uchar* lptr = (local uchar*) gptr; // FAIL! uchar* pptr = (uchar*) gptr; // FAIL! private is implicit 0x1000 global local private 0x1000 points to something different depending on the address space!
  • 28. Memory Space Pointer Hell Pointers must always be fully qualified Macros to help ease the pain #define GPTR(TYPE) global TYPE* #define CPTR(TYPE) constant TYPE* #define LPTR(TYPE) local TYPE* #define PPTR(TYPE) private TYPE*
  • 29. No Recursion!?!?!? No call stack All functions are inlined to the kernel function uint factorial(uint n) { if (n <= 1) return 1; else return n * factorial(n - 1); // compile-time error }
  • 30. No standard libc libraries memcpy? strcpy? strcmp? etc...
  • 31. No standard libc libraries Implement our own #define MEMCPY(NAME, DEST_AS, SRC_AS) DEST_AS void* NAME(DEST_AS void*, SRC_AS const void*, uint); DEST_AS void* NAME(DEST_AS void* dest, SRC_AS const void* src, uint size) { DEST_AS uchar* cDest = (DEST_AS uchar*)dest; SRC_AS const uchar* cSrc = (SRC_AS const uchar*)src; for (uint i = 0; i < size; i++) cDest[i] = cSrc[i]; return (DEST_AS void*)cDest; } PTR_MACRO_DEST_SRC(MEMCPY, memcpy) Produces memcpy_g memcpy_gc memcpy_lc memcpy_pc memcpy_l memcpy_gl memcpy_lg memcpy_pg memcpy_p memcpy_gp memcpy_lp memcpy_pl
  • 32. No dynamic memory No malloc() No free() What to do...
  • 33. Yes! dynamic memory Create a large buffer of global memory - our “heap” Implement our own malloc() and free() Create a handle structure - “virtual memory” P(T, hnd) macro to get the current pointer address GPTR(handle) hnd = malloc(sizeof(uint)); GPTR(uint) ptr = P(uint, hnd); *ptr = 0xdeadbeef; free(hnd);
  • 35. Ok, we get the point... FYL!
  • 36. High-level Architecture V8 Data Heap Esprima Parser Stack-based Interpreter Host Host Host GPUs Data Serializer & Marshaller Garbage Collector Device Mgr
  • 37. High-level Architecture eval(code); V8 Data Heap Build JSON AST Esprima Parser Stack-based Interpreter Host Host Host GPUs Data Serializer & Marshaller Garbage Collector Device Mgr
  • 38. High-level Architecture eval(code); V8 Data Heap Build JSON AST Esprima Parser Stack-based Interpreter Serialize AST Host Host Host JSON => C Structs GPUs Data Serializer & Marshaller Garbage Collector Device Mgr
  • 39. High-level Architecture eval(code); V8 Data Heap Build JSON AST Esprima Parser Stack-based Interpreter Serialize AST Host Host Host JSON => C Structs GPUs Data Serializer & Marshaller Garbage Collector Ship to GPU to Interpret Device Mgr
  • 40. High-level Architecture eval(code); V8 Data Heap Build JSON AST Esprima Parser Stack-based Interpreter Serialize AST Host Host Host JSON => C Structs GPUs Data Serializer & Marshaller Garbage Collector Ship to GPU to Interpret Device Mgr Fetch Result
  • 42. AST Generation JSON AST JavaScript Source (v8::Object) Lateral AST Esprima in V8 (C structs)
  • 43. Embed esprima.js Resource Generator $ resgen esprima.js resgen_esprima_js.c
  • 44. Embed esprima.js resgen_esprima_js.c const unsigned char resgen_esprima_js[] = { 0x2f, 0x2a, 0x0a, 0x20, 0x20, 0x43, 0x6f, 0x70, 0x79, 0x72, 0x69, 0x67, 0x68, 0x74, 0x20, 0x28, 0x43, 0x29, 0x20, 0x32, ... 0x20, 0x3a, 0x20, 0x2a, 0x2f, 0x0a, 0x0a, 0 };
  • 45. Embed esprima.js ASTGenerator.cpp extern const char resgen_esprima_js; void ASTGenerator::init() { HandleScope scope; s_context = Context::New(); s_context->Enter(); Handle<Script> script = Script::Compile(String::New(&resgen_esprima_js)); script->Run(); s_context->Exit(); s_initialized = true; }
  • 46. Build JSON AST e.g. ASTGenerator::esprimaParse( "var xyz = new Array(10);" );
  • 47. Build JSON AST Handle<Object> ASTGenerator::esprimaParse(const char* javascript) { if (!s_initialized) init(); HandleScope scope; s_context->Enter(); Handle<Object> global = s_context->Global(); Handle<Object> esprima = Handle<Object>::Cast(global->Get(String::New("esprima"))); Handle<Function> esprimaParse = Handle<Function>::Cast(esprima- >Get(String::New("parse"))); Handle<String> code = String::New(javascript); Handle<Object> ast = Handle<Object>::Cast(esprimaParse->Call(esprima, 1, (Handle<Value>*)&code)); s_context->Exit(); return scope.Close(ast); }
  • 48. Build JSON AST { "type": "VariableDeclaration", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "xyz" }, "init": { "type": "NewExpression", "callee": { "type": "Identifier", "name": "Array" }, "arguments": [ { "type": "Literal", "value": 10 } ] } } ], "kind": "var" }
  • 49. Lateral AST structs typedef struct ast_type_st { #ifdef __OPENCL_VERSION__ CL(uint) id; #define CL(TYPE) TYPE CL(uint) size; #else } ast_type; #define CL(TYPE) cl_##TYPE #endif typedef struct ast_program_st { ast_type type; CL(uint) body; CL(uint) numBody; Structs shared between } ast_program; Host and OpenCL typedef struct ast_identifier_st { ast_type type; CL(uint) name; } ast_identifier;
  • 50. Lateral AST structs v8::Object => ast_type expanded ast_type* vd1_1_init_id = (ast_type*)astCreateIdentifier("Array"); ast_type* vd1_1_init_args[1]; vd1_1_init_args[0] = (ast_type*)astCreateNumberLiteral(10); ast_type* vd1_1_init = (ast_type*)astCreateNewExpression(vd1_1_init_id, vd1_1_init_args, 1); free(vd1_1_init_id); for (int i = 0; i < 1; i++) free(vd1_1_init_args[i]); ast_type* vd1_1_id = (ast_type*)astCreateIdentifier("xyz"); ast_type* vd1_decls[1]; vd1_decls[0] = (ast_type*)astCreateVariableDeclarator(vd1_1_id, vd1_1_init); free(vd1_1_id); free(vd1_1_init); ast_type* vd1 = (ast_type*)astCreateVariableDeclaration(vd1_decls, 1, "var"); for (int i = 0; i < 1; i++) free(vd1_decls[i]);
  • 51. Lateral AST structs astCreateIdentifier ast_identifier* astCreateIdentifier(const char* str) { CL(uint) size = sizeof(ast_identifier) + rnd(strlen(str) + 1, 4); ast_identifier* ast_id = (ast_identifier*)malloc(size); // copy the string strcpy((char*)(ast_id + 1), str); // fill the struct ast_id->type.id = AST_IDENTIFIER; ast_id->type.size = size; ast_id->name = sizeof(ast_identifier); // offset return ast_id; }
  • 52. Lateral AST structs astCreateIdentifier(“xyz”) offset field value 0 type.id AST_IDENTIFIER (0x01) 4 type.size 16 8 name 12 (offset) 12 str[0] ‘x’ 13 str[1] ‘y’ 14 str[2] ‘z’ 15 str[3] ‘0’
  • 53. Lateral AST structs astCreateNewExpression ast_expression_new* astCreateNewExpression(ast_type* callee, ast_type** arguments, int numArgs) { CL(uint) size = sizeof(ast_expression_new) + callee->size; for (int i = 0; i < numArgs; i++) size += arguments[i]->size; ast_expression_new* ast_new = (ast_expression_new*)malloc(size); ast_new->type.id = AST_NEW_EXPR; ast_new->type.size = size; CL(uint) offset = sizeof(ast_expression_new); char* dest = (char*)ast_new; // copy callee memcpy(dest + offset, callee, callee->size); ast_new->callee = offset; offset += callee->size; // copy arguments if (numArgs) { ast_new->arguments = offset; for (int i = 0; i < numArgs; i++) { ast_type* arg = arguments[i]; memcpy(dest + offset, arg, arg->size); offset += arg->size; } } else ast_new->arguments = 0; ast_new->numArguments = numArgs; return ast_new; }
  • 54. Lateral AST structs new Array(10) offset field value 0 type.id AST_NEW_EXPR (0x308) 4 type.size 52 8 callee 20 (offset) 12 arguments 40 (offset) 16 numArguments 1 20 callee node ast_identifier (“Array”) arguments 40 ast_literal_number (10) node
  • 55. Lateral AST structs Shared across the Host and the OpenCL runtime Host writes, Lateral reads Constructed on Host as contiguous blobs Easy to send to GPU: memcpy(gpu, ast, ast->size); Fast to send to GPU, single buffer write Simple to traverse w/ pointer arithmetic
  • 57. Building Blocks JS Type Structs AST Traverse Stack Lateral State Call/Exec Stack Heap Symbol/Ref Table Return Stack Scope Stack AST Traverse Loop Interpret Loop
  • 58. Kernels #include "state.h" #include "jsvm/asttraverse.h" #include "jsvm/interpreter.h" // Setup VM structures kernel void lateral_init(GPTR(uchar) lateral_heap) { LATERAL_STATE_INIT } // Interpret the AST kernel void lateral(GPTR(uchar) lateral_heap, GPTR(ast_type) lateral_ast) { LATERAL_STATE ast_push(lateral_ast); while (!Q_EMPTY(lateral_state->ast_stack, ast_q) || !Q_EMPTY(lateral_state->call_stack, call_q)) { while (!Q_EMPTY(lateral_state->ast_stack, ast_q)) traverse(); if (!Q_EMPTY(lateral_state->call_stack, call_q)) interpret(); } }
  • 60. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 61. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDecl "id": { "type": "Identifier", "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 62. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDtor "id": { "type": "Identifier", "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 63. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", Ident VarDtor "id": { "type": "Identifier", Binary "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 64. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", Ident VarDtor "id": { "type": "Identifier", Literal Binary }, "name": "x" Literal "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 65. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", Ident VarDtor "id": { "type": "Identifier", Literal Binary }, "name": "x" Literal "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 66. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", Ident VarDtor "id": { "type": "Identifier", Binary }, "name": "x" Literal "init": { "type": "BinaryExpression", Literal "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 67. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDtor "id": { "type": "Identifier", Binary }, "name": "x" Literal "init": { "type": "BinaryExpression", Literal "operator": "+", "left": { Ident "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 68. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDtor “x” "id": { "type": "Identifier", Binary }, "name": "x" Literal "init": { "type": "BinaryExpression", Literal "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 69. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDtor “x” "id": { "type": "Identifier", Binary 1 }, "name": "x" Literal "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 70. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDtor “x” "id": { "type": "Identifier", Binary 1 }, "name": "x" 2 "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 71. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", VarDtor “x” "id": { "type": "Identifier", 3 "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 72. var x = 1 + 2; { "type": "VariableDeclaration", AST Call Return "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" }
  • 74. Benchmark Small loop of FLOPs var input = new Array(10); for (var i = 0; i < input.length; i++) { input[i] = Math.pow((i + 1) / 1.23, 3); }
  • 75. Execution Time Lateral GPU CL CPU CL V8 ATI Radeon 6770m Intel Core i7 4x2.4Ghz Intel Core i7 4x2.4Ghz 116.571533ms 0.226007ms 0.090664ms
  • 76. Execution Time Lateral GPU CL CPU CL V8 ATI Radeon 6770m Intel Core i7 4x2.4Ghz Intel Core i7 4x2.4Ghz 116.571533ms 0.226007ms 0.090664ms
  • 78. What went wrong? Everything Stack-based AST Interpreter, no optimizations Heavy global memory access, no optimizations No data or task parallelism
  • 79. Stack-based Interpreter Slow as molasses Memory hog Eclipse style Heavy memory access “var x = 1 + 2;” == 30 stack hits alone! Too much dynamic allocation No inline optimizations, just following the yellow brick AST Straight up lazy Replace with something better! Bytecode compiler on Host Bytecode register-based interpreter on Device
  • 81. Too much global access Everything is dynamically allocated to global memory Register based interpreter & bytecode compiler can make better use of local and private memory // 11.1207 seconds size_t tid = get_global_id(0); c[tid] = a[tid]; while(b[tid] > 0) { // touch global memory on each loop b[tid]--; // touch global memory on each loop c[tid]++; // touch global memory on each loop Optimizing memory access } // 0.0445558 seconds!! HOLY SHIT! yields crazy results size_t tid = get_global_id(0); int tmp = a[tid]; // temp private variable for(int i=b[tid]; i > 0; i--) tmp++; // touch private variables on each loop c[tid] = tmp; // touch global memory one time
  • 82. No data or task parallelism Everything being interpreted in a single “thread” We have hundreds of cores available to us! Build in heuristics Identify side-effect free statements Break into parallel tasks - very magical input[0] = Math.pow((0 + 1) / 1.23, 3); var input = new Array(10); for (var i = 0; i < input.length; i++) { input[1] = Math.pow((1 + 1) / 1.23, 3); } input[i] = Math.pow((i + 1) / 1.23, 3); ... input[9] = Math.pow((9 + 1) / 1.23, 3);
  • 83. What’s in store Acceptable performance on all CL devices V8/Node extension to launch Lateral tasks High-level API to perform map-reduce, etc. Lateral-cluster...mmmmm
  • 84. Thanks! Jarred Nicholls @jarrednicholls [email protected]