Oleksii Moskalenko, "Continuous Delivery of ML Pipelines to Production" (Fwdays)
Here in the DS team at WIX, we want to help people create stunning sites by applying recent achievements of AI research in production. Since data science engineering practices are still not fully shaped, we found it crucial to bring in the best practices from software engineering: give data scientists the ability to deliver models fast, without loss of quality or computational efficiency, to stay competitive in this overhyped market. To achieve this, we are developing our own infrastructure for creating pipelines and deploying them to production with minimal (to no) engineering involvement.
This talk will cover the initial motivation, the technical issues solved, and the lessons learned while building such an ML delivery system.
Website: https://fwdays.com/en/event/data-science-fwdays-2019/review/continuous-delivery-of-ml-pipelines-to-production
HyperGraphDB is a database that uses hypergraphs instead of graphs to represent data. It allows relationships between any number of objects instead of just pairs of objects. This more powerful representation allows for more compact and natural modeling of data. HyperGraphDB has applications in artificial intelligence, computational biology, knowledge bases, and more. It provides a type system where types are represented as objects, allowing dynamic extension of the schema.
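HyperGraphDB itself is a Java library; as a rough conceptual sketch only (not its API), a hyperedge can be pictured as a relation over any number of atoms, with links themselves stored as atoms:

```python
# Conceptual sketch only: a toy hypergraph in plain Python, NOT the
# HyperGraphDB API. A hyperedge relates ANY number of atoms, not just a
# pair as in an ordinary graph, and a link is itself an atom.
from itertools import count

class HyperGraph:
    def __init__(self):
        self._ids = count()
        self.atoms = {}            # handle -> value (values can be relations too)

    def add(self, value):
        handle = next(self._ids)
        self.atoms[handle] = value
        return handle

    def link(self, relation, *handles):
        # The link is stored as an atom, so links can point to other links.
        return self.add((relation, handles))

g = HyperGraph()
alice, bob, carol = g.add("Alice"), g.add("Bob"), g.add("Carol")
# One hyperedge relating three atoms at once:
meeting = g.link("attended-meeting", alice, bob, carol)
print(g.atoms[meeting])
```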
Concept Drift: Monitoring Model Quality In Streaming ML Applications (Lightbend)
Most machine learning algorithms are designed to work with stationary data. Yet, real-life streaming data is rarely stationary. Machine learned models built on data observed within a fixed time period usually suffer loss of prediction quality due to what is known as concept drift.
The most common method to deal with concept drift is periodically retraining the models with new data. The length of the period is usually determined based on cost of retraining. The changes in the input data and the quality of predictions are not monitored, and the cost of inaccurate predictions is not included in these calculations.
A better alternative is monitoring the model quality by testing the inputs and predictions for changes over time, and using change points in retraining decisions. There has been significant development in this area within the last two decades.
In this webinar, Emre Velipasaoglu, Principal Data Scientist at Lightbend, Inc., will review the successful methods of machine learned model quality monitoring.
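As a minimal sketch of one standard change-detection check (not necessarily the specific methods reviewed in the webinar), a window of recent model inputs can be compared against a reference window with a two-sample Kolmogorov-Smirnov test:

```python
# Minimal drift check: compare recent inputs against a reference window
# with a two-sample Kolmogorov-Smirnov test; a low p-value is a change
# point that can feed into retraining decisions.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=1000)  # data seen at training time
recent = rng.normal(loc=0.8, scale=1.0, size=1000)     # stream shifted by drift

stat, p_value = ks_2samp(reference, recent)
if p_value < 0.01:
    print(f"Drift detected (KS statistic={stat:.3f}); consider retraining.")
else:
    print("No significant change detected.")
```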
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... (Databricks)
ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but even these platforms are limited to a few supported algorithms and to each company’s internal infrastructure. In this session, we introduce MLflow, a new open source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size. In this deep-dive session, through a complete ML model life-cycle example, you will walk away with:
MLflow concepts and abstractions for models, experiments, and projects
How to get started with MLflow
Understand aspects of MLflow APIs
Using tracking APIs during model training
Using MLflow UI to visually compare and contrast experimental runs with different tuning parameters and evaluate metrics
Package, save, and deploy an MLflow model
Serve it using MLflow REST API
What’s next and how to contribute
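As a minimal, hypothetical sketch of the tracking workflow listed above (placeholder parameter and metric values, not a real training run):

```python
# Minimal sketch of the MLflow tracking API; the values logged here are
# placeholders standing in for a real training loop.
import mlflow

mlflow.set_experiment("demo-experiment")

with mlflow.start_run(run_name="baseline"):
    # Log hyperparameters so the run is reproducible.
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 100)

    # Log evaluation metrics; in a real run these come from your model.
    for epoch, rmse in enumerate([0.92, 0.81, 0.77]):
        mlflow.log_metric("rmse", rmse, step=epoch)

# Afterwards, compare runs visually with: mlflow ui
```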
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu... (Databricks)
The document summarizes a presentation about state-of-the-art natural language processing (NLP) techniques. It discusses how transformer networks have achieved state-of-the-art results in many NLP tasks using transfer learning from large pre-trained models. It also describes how Hugging Face's Transformers library and Tokenizers library provide tools for tokenization and using pre-trained transformer models through a simple interface.
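A minimal example of the simple pipeline interface the library provides; the default pretrained checkpoint is downloaded on first use:

```python
# The Transformers "pipeline" interface: a pretrained transformer model and
# its tokenizer behind one call.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Transfer learning makes state-of-the-art NLP accessible."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```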
My talk from SICS Data Science Day, describing FlinkML, the Machine Learning library for Apache Flink.
I talk about our approach to large-scale machine learning and how we utilize state-of-the-art algorithms to ensure FlinkML is a truly scalable library.
You can watch a video of the talk here: https://youtu.be/k29qoCm4c_k
Introductory presentation for the Clash of Technologies: RxJS vs RxJava event organized by SoftServe @ betahouse (17.01.2015). Comparison document with questions & answers available here: https://docs.google.com/document/d/1VhuXJUcILsMSP4_6pCCXBP0X5lEVTsmLivKHcUkFvFY/edit#.
This document compares three high-level programming languages for Big Data analytics on Hadoop clusters: Pig Latin, HiveQL, and Jaql. It analyzes and compares the languages based on four criteria: expressive power, performance, query processing methods, and how each language implements joins. The document finds that while each language has strengths in certain areas, no single language is superior in all criteria. Developers must consider the unique aspects of each language and criteria that matter most for their specific applications and datasets.
This covers details of the processes of compilation. A lot of extra teaching support is required with these.
Originally written for AQA A level Computing (UK exam).
The document provides an overview of deep learning concepts and techniques for natural language processing tasks. It includes the following:
1. A schedule for a deep learning workshop covering fundamentals of deep learning for machine translation, word embeddings, neural language models, and neural machine translation.
2. Descriptions of neural networks, activation functions, backpropagation, and word embeddings.
3. Details about feedforward neural network language models, recurrent neural network language models, and how they are applied to tasks like language modeling and machine translation.
4. An explanation of attention-based encoder-decoder models for neural machine translation.
Natural Language to Visualization by Neural Machine Translation (ivaderivader)
This document summarizes a method called ncNet that uses neural machine translation to translate natural language queries to Vega-Zero visualization specifications. Key points include:
- ncNet is a Transformer-based seq2seq model that can translate natural language queries about data into specifications for visualizations.
- It uses techniques like attention forcing and visualization-aware translation to generate valid Vega-Zero outputs from the queries.
- An evaluation compares ncNet to existing methods on a dataset of natural language queries paired with Vega-Zero specifications, finding it outperforms baselines.
The document discusses declarative programming as it relates to network programmability. It provides examples of declarative versus imperative code and explains key concepts of declarative programming like lack of side effects, referential transparency, and idempotence. It also discusses how declarative programming can provide benefits like robustness, scalability, and reusability for network systems, which often operate in uncertain distributed environments. Finally, it outlines some declarative programming approaches being used for network control, orchestration, and automation.
This document discusses parallel programming and some common approaches used in .NET. It explains that parallel programming involves partitioning work into chunks that can execute concurrently on multiple threads or processor cores. It then describes several common .NET APIs for parallel programming: the Task Parallel Library (TPL) for general parallelism, PLINQ for parallel LINQ queries, Parallel class methods for data parallelism, and lower-level task parallelism using the Task class.
This document provides an overview of message passing computing and the Message Passing Interface (MPI) library. It discusses message passing concepts, the Single Program Multiple Data (SPMD) model, point-to-point communication using send and receive routines, message tags, communicators, debugging tools, and evaluating performance through timing. Key points covered include how MPI defines a standard for message passing between processes, common routines like MPI_Send and MPI_Recv, and how to compile and execute MPI programs on multiple computers.
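A minimal SPMD point-to-point sketch using mpi4py, the Python binding for the MPI_Send/MPI_Recv routines discussed above (assuming mpi4py is installed):

```python
# SPMD point-to-point messaging with mpi4py: every process runs the same
# program and branches on its rank; tags distinguish messages.
# Run with, e.g.: mpiexec -n 2 python send_recv.py
from mpi4py import MPI

comm = MPI.COMM_WORLD          # the default communicator
rank = comm.Get_rank()

if rank == 0:
    comm.send({"payload": 42}, dest=1, tag=11)
    print("rank 0 sent a message")
elif rank == 1:
    data = comm.recv(source=0, tag=11)
    print(f"rank 1 received {data}")
```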
Flink Forward SF 2017: Dean Wampler - Streaming Deep Learning Scenarios with... (Flink Forward)
As a low-latency streaming tool, Flink offers the possibility of using machine learning, even "deep learning" (neural networks), with low latency. The growing FlinkML library provides some of the infrastructure support required for this goal, combined with third-party tools. This talk is a progress report on several scenarios we are developing at Lightbend, which combine Flink, Deeplearning4J, Spark, and Kafka to analyze cluster telemetry for anomaly detection, predictive autoscaling, and other scenarios. I'll focus on the pragmatics of training deep learning models in a streaming context, using batch and mini-batch training, combined with low-latency application of those models. I'll discuss the architecture we're using and highlight trade offs of particular tools for certain design problems in the implementation. I'll discuss the drawbacks and workarounds of our design and finish with a look at how future developments in Flink could improve its support for scenarios like ours.
Source-to-source transformations: Supporting tools and infrastructure (kaveirious)
Introduction to source-to-source transformation. Concept and overview. Basics of existing tools (TXL, ROSE, Cetus, EDG, C-to-C, Memphis); pros and cons. Part of an internal evaluation for selecting a source-to-source transformation tool.
Transfer learning in NLP involves pre-training large language models on unlabeled text and then fine-tuning them on downstream tasks. Current state-of-the-art models such as BERT, GPT-2, and XLNet use bidirectional transformers pretrained using techniques like masked language modeling. These models have billions of parameters and require huge amounts of compute but have achieved SOTA results on many NLP tasks. Researchers are exploring ways to reduce model sizes through techniques like distillation while maintaining high performance. Open questions remain around model interpretability and generalization.
Neural machine translation has surpassed statistical machine translation as the leading approach. It uses an encoder-decoder model with attention to learn translation representations from large parallel corpora. Recent developments include incorporating monolingual data through language models, improving attention mechanisms, and minimizing evaluation metrics like BLEU during training rather than just cross-entropy. Open problems remain around handling rare words, semantic meaning, and context. Future work may focus on multilingual models, low-resource translation, and generating text for other modalities like images.
Overview of the SPARQL-Generate language and latest developments (Maxime Lefrançois)
SPARQL-Generate is an extension of SPARQL 1.1 for querying not only RDF datasets but also documents in arbitrary formats. The solution bindings can then be used to output RDF (SPARQL-Generate) or text (SPARQL-Template)
Anyone familiar with SPARQL can easily learn SPARQL-Generate; learning SPARQL-Generate helps you learn SPARQL.
The open-source implementation (Apache 2 license) is based on Apache Jena and can be used to execute transformations from a combination of RDF and any kind of documents in XML, JSON, CSV, HTML, GeoJSON, CBOR, streams of messages using WebSocket or MQTT... (easily extensible)
Recent extensions and improvements include:
- heavy refactoring to support parallelization
- more expressive iterators and functions
- simple generation of RDF lists
- support of aggregates
- generation of HDT (thanks Ana for the use case)
- partial implementation of STTL for the generation of Text (https://ns.inria.fr/sparql-template/)
- partial implementation of LDScript (http://ns.inria.fr/sparql-extension/)
- integration of all these types of rules to decouple or compose queries, e.g.:
- call a SPARQL-Generate query in the SPARQL FROM clause
- plug a SPARQL-Generate or a SPARQL-Template query into the output of a SPARQL-Select function
- a Sublime Text package for local development
This document discusses implementing a parallel merge sort algorithm using MPI (Message Passing Interface). It describes the background of MPI and how it can be used for communication between processes. It provides details on the dataset used, MPI functions for initialization, communication between processes, and summarizes the results which show a decrease in runtime when increasing the number of processors.
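A hedged sketch of the same scatter/sort/merge idea in mpi4py (illustrative only, not the document's exact implementation):

```python
# Parallel merge sort sketch: the root scatters chunks, each process sorts
# its chunk locally, and the root k-way merges the sorted chunks.
# Run with, e.g.: mpiexec -n 4 python pmergesort.py
import heapq
import random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    data = [random.randint(0, 999) for _ in range(1000)]
    chunks = [data[i::size] for i in range(size)]  # one chunk per process
else:
    chunks = None

local = comm.scatter(chunks, root=0)   # distribute chunks
local.sort()                           # each process sorts its own chunk

sorted_chunks = comm.gather(local, root=0)
if rank == 0:
    result = list(heapq.merge(*sorted_chunks))   # k-way merge at the root
    print(result[:10], "...", result[-10:])
```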
This document summarizes a presentation about using the Task Parallel Library (TPL) for data flow tasks in .NET. It discusses how TPL can be used to parallelize image processing pipelines by modeling the stages as data flow blocks. The key TPL data flow blocks for sources, targets, buffering, transformations, and joins are explained. Code examples are provided for building a skeletal image processing program using these TPL data flow capabilities.
The document provides an overview of mpiJava, an open-source software package that provides Java wrappers for the Message Passing Interface (MPI) through the Java Native Interface. MpiJava implements a Java API for MPI and was one of the early efforts to bring message passing capabilities to Java for high-performance and distributed computing. The summary discusses mpiJava's implementation, API design, usage, and programming model.
This document discusses decision making and loops in Python. It begins with an introduction to decision making using if/else statements and examples of checking conditions. It then covers different types of loops - for, while, and do-while loops. The for loop is used when the number of iterations is known, while the while loop is used when it is unknown. It provides examples of using range() with for loops and examples of while loops.
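Small examples of the constructs described, including the usual emulation of do-while (which Python lacks as a built-in statement):

```python
# if/else decisions, a for loop over range(), a while loop, and a
# do-while emulation (the body runs at least once, then tests the condition).
n = 7
if n % 2 == 0:
    print(f"{n} is even")
else:
    print(f"{n} is odd")

for i in range(1, 4):          # known number of iterations
    print("for iteration", i)

countdown = 3
while countdown > 0:           # iteration count not known in advance
    print("while iteration", countdown)
    countdown -= 1

while True:                    # do-while emulation
    value = 42
    print("do-while body ran once")
    if value == 42:
        break
```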
Data lineage has gained popularity in the Machine Learning community as a way to make models and datasets easier to interpret and to help developers debug their ML pipelines by enabling them to go from a model to the dataset/user who trained it. Data provenance and lineage is the process of building up the history of how a data artifact came to be. This history of derivations and interactions can provide a better context for data discovery, debugging, as well as auditing. In this area, others, such as Google and Databricks, have made small steps.
In the Hopsworks approach presented, provenance information is collected implicitly through unobtrusive instrumentation of Jupyter notebooks and Python code - what we call 'implicit provenance'.
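As a toy illustration of the idea (emphatically not the Hopsworks implementation), a decorator can record which inputs produced which outputs without the user changing their code:

```python
# Toy "implicit provenance": wrap functions so every derivation is recorded
# in an append-only lineage log, letting you walk from an artifact back to
# the inputs and code that produced it.
import functools
import time

LINEAGE = []   # append-only history of derivations

def track(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        out = fn(*args, **kwargs)
        LINEAGE.append({
            "output": repr(out)[:60],
            "derived_by": fn.__name__,
            "inputs": [repr(a)[:60] for a in args],
            "at": time.time(),
        })
        return out
    return wrapper

@track
def clean(dataset):
    return [x for x in dataset if x is not None]

@track
def train(dataset):
    return {"name": "model-v1", "weight": sum(dataset) / len(dataset)}

model = train(clean([1, 2, None, 3]))
for record in LINEAGE:          # walk from the model back to its inputs
    print(record)
```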
GVCs and Sustainable Development Interaction (Ulker Aliyeva)
The document discusses global value chains (GVCs) and their interaction with sustainable development. It provides an overview of GVCs, defining them and outlining trends like the rise of intermediate goods trade. It then examines the research methods used and discusses findings around the impacts of GVC participation, including economic impacts like job creation and value capture, social impacts on working conditions, and environmental impacts. The conclusion emphasizes the need for a balanced approach between business and policy goals to promote sustainable development through GVCs.
The document lists 25 of the healthiest fruits, briefly describing the nutritional value, potential health benefits, and interesting tips for each one. The highlighted fruits include apples, avocados, bananas, and berries such as blackberries, blueberries, and raspberries, recognized for their high antioxidant levels and disease-fighting potential.
1. Fluorescent nanodiamonds containing nitrogen-vacancy centers are a versatile tool for long-term cell tracking, super-resolution imaging, and nanoscale temperature sensing due to their unique optical and magnetic properties.
2. Fluorescent nanodiamonds can be internalized by cells without toxicity and remain inside cells for long-term tracking over weeks as cells divide and proliferate. They have been used to track lung stem cells transplanted in mouse models.
3. As photostable fluorophores without blinking or photobleaching, fluorescent nanodiamonds are suitable for super-resolution imaging techniques like STED that can achieve a 20-fold improvement in spatial resolution, crucial for revealing biological structures.
Are you considering adding a new furry friend to your home? Cats make perfect companions for both children and adults. It is ideal to adopt a cat rather than purchasing one from a store. This presentation gives you 7 reasons why you should consider adopting a cat from your local shelter.
Terry Taylor is seeking a procurement position and has over 16 years of experience in purchasing, logistics, and inventory control. He has 14 years of experience as a Lawson ERP consultant, providing implementation, training, customization, and ongoing support services to healthcare clients across various functions and locations. He has comprehensive skills and certifications in Lawson, IT technologies, and business applications.
This document outlines the key characteristics and design issues of distributed systems. It defines a distributed system as one where components located at networked computers communicate and coordinate through passing messages. Key characteristics include having multiple autonomous components, non-shared resources, concurrent processes across nodes, and multiple points of failure. The document also discusses common distributed system design issues such as naming, communication, and system architecture. Examples of distributed systems include local area networks, database systems, and the Internet.
Maxime Petazzoni gave a presentation on Docker at SignalFx. He discussed how SignalFx uses Docker for infrastructure separation, development lifecycles, application packaging and delivery, and orchestration. He explained how SignalFx monitors Docker containers using CollectD and their Docker plugin to collect host and container metrics. SignalFx provides curated dashboards and anomaly detection for correlated system, container and application metrics.
The document discusses how social networks and personal branding can be used to search for a job effectively. It presents several challenges of the current and future labor market, and highlights the importance of developing digital and communication skills to succeed in the job search.
Swift is a parallel scripting language that makes parallelism, failure recovery, and computing location transparent for running loosely-coupled application programs and utilities linked by exchanging files on clusters, clouds, and grids. It allows scripts to run on multiple distributed sites and diverse computing resources in a simple way. The example MODIS script processes land use images in parallel across different resources without needing to specify specific sites, and Swift's location independence lets users focus on science instead of configuration details.
Swift Parallel Scripting for High-Performance Workflow (Daniel S. Katz)
The Swift scripting language was created to provide a simple, compact way to write parallel scripts that run many copies of ordinary programs concurrently in various workflow patterns, reducing the need for complex parallel programming or arcane scripting to achieve this common high-level task. The result was a highly portable programming model based on implicitly parallel functional dataflow. The same Swift script runs on multi-core computers, clusters, grids, clouds, and supercomputers, and is thus a useful tool for moving workflow computations from laptop to distributed and/or high performance systems.
Swift has proven to be very general, and is in use in domains ranging from earth systems to bioinformatics to molecular modeling. It has more recently been adapted to serve as a programming model for much finer-grained in-memory workflow on extreme-scale systems, where it can sustain task rates in the millions to billions per second.
In this talk, we describe the state of Swift's implementation, present several Swift applications, and discuss ideas for the future evolution of the programming model on which it's based.
Building and deploying LLM applications with Apache Airflow (Kaxil Naik)
Behind the growing interest in Generative AI and LLM-based enterprise applications lies an expanded set of requirements for data integration and ML orchestration. Enterprises want to use proprietary data to power LLM-based applications that create new business value, but they face challenges in moving beyond experimentation. The pipelines that power these models need to run reliably at scale, bringing together data from many sources and reacting continuously to changing conditions.
This talk focuses on the design patterns for using Apache Airflow to support LLM applications created using private enterprise data. We’ll go through a real-world example of what this looks like, as well as a proposal to improve Airflow and to add additional Airflow Providers to make it easier to interact with LLMs such as the ones from OpenAI (such as GPT4) and the ones on HuggingFace, while working with both structured and unstructured data.
In short, this shows how these Airflow patterns enable reliable, traceable, and scalable LLM applications within the enterprise.
https://airflowsummit.org/sessions/2023/keynote-llm/
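A minimal sketch of such a pipeline using Airflow's TaskFlow API (Airflow 2.x); the task names and bodies here are hypothetical placeholders, not APIs from the talk:

```python
# Sketch of an LLM data pipeline as an Airflow DAG. fetch_docs/embed_docs
# are hypothetical placeholder tasks standing in for real integrations.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2023, 1, 1), catchup=False)
def llm_ingest_pipeline():
    @task
    def fetch_docs():
        # Pull proprietary documents from an internal source (placeholder).
        return ["doc one text", "doc two text"]

    @task
    def embed_docs(docs: list):
        # Call an embedding model or LLM here (placeholder logic).
        return [{"doc": d, "embedding": [0.0] * 3} for d in docs]

    embed_docs(fetch_docs())

llm_ingest_pipeline()
```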
Concurrency Programming in Java - 01 - Introduction to Concurrency Programming (Sachintha Gunasena)
This session discusses a basic high-level introduction to concurrency programming with Java which include:
programming basics, OOP concepts, concurrency, concurrent programming, parallel computing, concurrent vs parallel, why concurrency, a real-world example, terms, Moore's Law, Amdahl's Law (a quick worked example follows below), types of parallel computation, MIMD variants, the shared memory model, the distributed memory model, the client-server model, the SCOOP mechanism, a SCOOP preview (a sequential program, and the same program in a concurrent setting using SCOOP), programming then & now, sequential programming, and concurrent programming.
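The Amdahl's Law example mentioned above: a quick calculation of the speedup ceiling when a fraction p of a program is parallelizable across n processors.

```python
# Amdahl's Law: speedup = 1 / ((1 - p) + p / n), where p is the
# parallelizable fraction and n the number of processors.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# With 95% of the work parallelizable, speedup plateaus near 20x no
# matter how many cores are added:
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```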
C++ is an object-oriented programming language that is an extension of C. It allows for data abstraction through the use of classes and objects. Some key features of C++ include data encapsulation, inheritance, polymorphism, and reusability. C++ is a mid-level language that is compiled, supports pointers, and has a rich standard library. It is commonly used for system applications, games, and other performance-critical software due to its speed and ability to interface with hardware. Some example applications include operating systems, browsers, databases, and graphics/game engines.
Swift is a new programming language developed by Apple as a replacement for Objective-C. It incorporates modern programming language design and borrows concepts from other languages like Objective-C, Rust, Haskell, Ruby, Python, C#, CLU, and more. Swift code is compiled with the LLVM compiler to produce optimized native code and works seamlessly with existing Objective-C code and Cocoa frameworks. It focuses on performance, safety, and ease of use through features like type safety, modern control flow syntax, and interactive playgrounds.
Near real-time anomaly detection at Lyft (markgrover)
Near real-time anomaly detection at Lyft, by Mark Grover and Thomas Weise at Strata NY 2018.
https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/69155
Declarative Programming and a form of SDN (Miya Kohno)
The document discusses declarative programming as it relates to network programmability. It provides examples of declarative versus imperative code and explains key concepts of declarative programming like lack of side effects, referential transparency, and idempotence. It also discusses how declarative programming could be beneficial for networking given its robustness in complex distributed environments but may lack universal computational power. OpenDaylight and ETSI NFV architectures are presented as examples combining declarative and imperative approaches.
This document discusses MapReduce and Big Data processing using ZeroVM, a lightweight virtualization platform. It provides an overview of MapReduce and how it is commonly implemented using Apache Hadoop. It then describes some limitations of running MapReduce on the cloud, including costly data transfers between storage and computing clusters. The document introduces ZeroVM as a way to run applications directly on storage clusters, avoiding these transfers. It outlines how ZeroVM enables MapReduce jobs to be run on the storage layer through its ZeroCloud module. Ongoing research at UTSA is further developing ZeroVM and ZeroCloud to optimize MapReduce for data locality, load balancing, and skew handling.
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams (Lightbend)
Audience: Architects, Data Scientists, Developers
Technical level: Introductory
From home intrusion detection, to self-driving cars, to keeping data center operations healthy, Machine Learning (ML) has become one of the hottest topics in software engineering today. While much of the focus has been on the actual creation of the algorithms used in ML, the less talked-about challenge is how to serve these models in production, often utilizing real-time streaming data.
The traditional approach to model serving is to treat the model as code, which means that ML implementation has to be continually adapted for model serving. As the amount of machine learning tools and techniques grows, the efficiency of such an approach is becoming more questionable. Additionally, machine learning and model serving are driven by very different quality of service requirements; while machine learning is typically batch, dealing with scalability and processing power, model serving is mostly concerned with performance and stability.
In this webinar with O’Reilly author and Lightbend Principal Architect, Boris Lublinsky, we will define an alternative approach to model serving, based on treating the model itself as data. Using popular frameworks like Akka Streams and Apache Flink, Boris will review how to implement this approach, explaining how it can help you:
* Achieve complete decoupling between the model implementation for machine learning and model serving, enforcing better standardization of your model serving implementation.
* Enable dynamic updates of the served model without having to restart the system.
* Utilize TensorFlow and PMML as model representations, and see how they are used to build a "real-time updatable" model-serving architecture (sketched schematically below).
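A schematic Python sketch of the model-as-data idea (a toy analogue of the Akka Streams / Flink designs discussed, not their code): the serving loop scores records from a data stream while a control stream delivers replacement models at runtime.

```python
# Model served as data: the scoring loop reads records from a data stream
# and swaps in new model parameters arriving on a control stream, without
# restarting the process.
import queue
import threading
import time

data_stream = queue.Queue()
control_stream = queue.Queue()     # carries model updates as plain data

def serve():
    model = {"coef": 1.0}          # current model, replaceable at runtime
    while True:
        record = data_stream.get()
        if record is None:         # end-of-stream sentinel
            break
        try:
            model = control_stream.get_nowait()   # apply any pending update
            print("model updated:", model)
        except queue.Empty:
            pass
        print("score:", model["coef"] * record)

t = threading.Thread(target=serve)
t.start()
for x in (1.0, 2.0):
    data_stream.put(x)
time.sleep(0.2)                      # let the first records be scored
control_stream.put({"coef": -5.0})   # new model arrives as data
data_stream.put(3.0)
data_stream.put(None)
t.join()
```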
This calculator has been developed by me. It gives high-precision results that a normal calculator cannot give. It is helpful in calculations for space technology, supercomputers, nanotechnology, etc. I can give this calculator to interested people.
Operationalizing Machine Learning: Serving ML Models (Lightbend)
Join O’Reilly author and Lightbend Principal Architect, Boris Lublinsky, as he discusses one of the hottest topics in software engineering today: serving machine learning models.
Typically with machine learning, different groups are responsible for model training and model serving. Data scientists often introduce their own machine-learning tools, causing software engineers to create complementary model-serving frameworks to keep pace. It’s not a very efficient system. In this webinar, Boris demonstrates a more standardized approach to model serving and model scoring:
* How to develop an architecture for serving models in real time as part of input stream processing
* How this approach enables data science teams to update models without restarting existing applications
* Different ways to build this model-scoring solution, using several popular stream processing engines and frameworks
The document provides an overview of SystemC and describes a sample program to illustrate key concepts. The example program models two modules that exchange Fibonacci number data through a bus. Each module contains two internal modules for processing and saving the numbers. One module uses an SC_METHOD thread, while the other uses an SC_THREAD. The modules communicate data through ports, channels and an interface to synchronize their operation controlled by a clock event. This demonstrates SystemC concepts like modules, channels, ports, interfaces, events and thread types for modeling concurrent hardware systems.
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi... (inside-BigData.com)
In this deck from PASC18, Robert Searles from the University of Delaware presents: Abstractions and Directives for Adapting Wavefront Algorithms to Future Architectures.
"Architectures are rapidly evolving, and exascale machines are expected to offer billion-way concurrency. We need to rethink algorithms, languages and programming models among other components in order to migrate large scale applications and explore parallelism on these machines. Although directive-based programming models allow programmers to worry less about programming and more about science, expressing complex parallel patterns in these models can be a daunting task especially when the goal is to match the performance that the hardware platforms can offer. One such pattern is wavefront. This paper extensively studies a wavefront-based miniapplication for Denovo, a production code for nuclear reactor modeling.
We parallelize the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm in the main kernel of Minisweep (the miniapplication) using CUDA, OpenMP and OpenACC. Our OpenACC implementation running on NVIDIA's next-generation Volta GPU boasts an 85.06x speedup over serial code, which is larger than CUDA's 83.72x speedup over the same serial implementation. Our experimental platform includes SummitDev, an ORNL representative architecture of the upcoming Summit supercomputer. Our parallelization effort across platforms also motivated us to define an abstract parallelism model that is architecture independent, with a goal of creating software abstractions that can be used by applications employing the wavefront sweep motif."
Watch the video: https://wp.me/p3RLHQ-iPU
Read the Full Paper: https://doi.org/10.1145/3218176.3218228
and
https://pasc18.pasc-conference.org/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Python bindings for SAF-AIS APIs offer many advantages to middleware developers, application developers, tool developers and testers. The bindings help to speed up the software development lifecycle and enable rapid deployment of architecture-independent components and services. This session will describe main principles guiding Python bindings implementation, and will have extensive in-depth application Python code examples using SAF-AIS services.
A performance analysis of OpenStack Cloud vs Real System on Hadoop Clusters (Kumari Surabhi)
It introduces a performance analysis of OpenStack Cloud versus commodity computers in big data environments. It concludes that data storage and analysis in a Hadoop cluster in the cloud are more flexible and more easily scalable than in a real-system cluster, but also that clusters of commodity computers are faster than cloud clusters.
The document discusses object oriented programming and Java. It provides a history of Java, describing how it was created at Sun Microsystems in the 1990s to be a simpler alternative to C++ that was architecture neutral, portable, distributed and secure. It then summarizes Java's key features including being object oriented, robust, simple, secure, portable and interpreted. It also describes Java's basic data types and how variables are declared and initialized in Java.
In this lecture, I introduce Massive Open Online Courses (MOOCs): how they are conducted and how xMOOCs differ from cMOOCs. I also include a list of platforms that host MOOC courses, a list of more than 1700 courses, and the top 10 MOOC courses of 2017.
This PPT discusses some programming puzzles related to encryption and emphasizes the need to strengthen the concept of bit-wise operators.
This was delivered yesterday at our college to enlighten 1st-year ECE and EEE students about engineering, engineering principles, how to be good engineering students, and finally how to grow as entrepreneurs.
Engineering is the application of scientific knowledge and mathematics to solve problems and design solutions that improve lives and benefit society. It involves using principles from various scientific fields like physics, chemistry, biology combined with design, business and other considerations to invent, innovate, build and maintain useful structures, machines, processes and systems. Some key aspects of engineering include identifying societal needs, designing and testing solutions, and producing things in a cost-effective manner to address those needs.
In this talk, I explain 21st-century predictions, 21st-century challenges, and how other nations are getting ready to face them. IoT and sensor developments are instrumental for eScience and data science.
The document discusses the status of technical education in India and proposes ways to improve it for the 21st century. It notes that over 50% of engineering college faculty are recent graduates who see teaching as temporary until they find industry jobs. This results in a lack of experienced teachers and motivation for students. It suggests universities provide refresher courses and teaching methodology training to help recent graduate faculty before they teach. Engaging recent graduates as teaching assistants first could also better prepare them to teach on their own.
This document discusses the need to improve culture at AITAM. It questions who the students, teachers, parents, and management currently are, and what characteristics they should have. The speaker notes issues like students not understanding basic math, going to toilets as a group, and teachers solely focusing on JNTU syllabus. It provides characteristics of great teachers as being prepared, engaging students, caring about them, and communicating with parents. Overall, the speaker calls for stakeholders to work on defining, creating, and continuing a better culture at AITAM.
This presentation was used in a refresher course at Nuzvid, as a one-day session of the course. It introduces research avenues in Image Processing and allied areas to faculty participants.
This document provides an overview of digital image processing (DIP) and discusses various topics related to it. It begins with welcoming remarks and introductions. It then discusses key areas of application for image processing like optical character recognition, security, compression, and medical imaging. Some main techniques covered include image acquisition, pre-processing, enhancement, segmentation, feature extraction, classification, and understanding. Application areas like remote sensing, astronomy, security, and OCR are also summarized. The document provides examples and illustrations of different image processing concepts.
This presentation has been used in many places, including Ambedkar Institute of Technology, Bangalore, where I engaged more than 60 faculty members for 5 full days in both tutorials and hands-on training. It explains Unix internals and socket programming; both datagram-based and IP-based concepts are explained with live examples.
This presentation has been used in many places, including Vignan and AITAM, and contains both tutorials and hands-on training. It explains Unix internals and socket programming; both datagram-based and IP-based concepts are explained with live examples.
This talk was developed for a one-full-day refresher course at Yanam. I introduced the audience to clustering, both hierarchical and non-hierarchical. Clustering methods such as K-Means and K-Medoids are introduced with live demonstrations.
This document provides an overview of writing OpenMP programs on multi-core machines. It discusses:
1) Why OpenMP is useful for parallel programming and its main components like compiler directives and library routines.
2) Elements of OpenMP like parallel regions, work sharing constructs, data scoping, and synchronization methods.
3) Achieving scalable speedup through techniques like breaking data dependencies, avoiding synchronization overheads, and improving data locality with cache and page placement.
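OpenMP itself is used from C, C++, and Fortran; as a loose Python analogue of its parallel-for work-sharing construct (an illustration of the idea, not OpenMP's API):

```python
# Rough Python analogue of an OpenMP "parallel for": independent loop
# iterations are shared across cores by a process pool. Iterations must be
# free of data dependencies, mirroring the constraint discussed above.
from concurrent.futures import ProcessPoolExecutor

def work(i: int) -> int:
    return i * i                     # an independent iteration

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:   # like "#pragma omp parallel for"
        results = list(pool.map(work, range(16)))
    print(results)
```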
This talk was given at Vizianagaram, where faculty from many engineering colleges attended. I introduced developments in multi-core computers along with their architectural evolution, and explained high-performance computing and where it is used. I also introduced the concept of pipelining, Amdahl's law, issues related to pipelining, and the MIPS architecture.
I introduced developments in multi-core computers along with their architectural evolution, and explained high-performance computing and where it is used. At the end, OpenMP is introduced with many ready-to-run parallel programs.
In this talk, I explained feature selection and extraction with emphasis on image processing. Methods such as Principal Component Analysis and Canonical Analysis are explained with numerical examples.
The aim of this talk is to highlight the importance of statistics in the 21st century, given the availability of a variety of sensors: MEMS, nano-sensors, e-sensors, and the IoT.
It is all about Artificial Intelligence (AI) and Machine Learning, at an introductory rather than advanced level; you can study it before the exam or check it for some information on AI for a project.
In the tube drawing process, a tube is pulled through a die and over a plug to reduce its diameter and thickness as required. The dimensional accuracy of cold-drawn tubes plays a vital role in the quality of end products and in controlling rejection in the manufacturing processes of those end products. Springback, the elastic strain recovery after the forming loads are removed, causes geometrical inaccuracies in drawn tubes, which in turn makes close dimensional tolerances difficult to achieve. In the present work, the springback of EN 8 D tube material is studied for various cold drawing parameters. The process parameters in this work are die semi-angle, land width, and drawing speed. The experimentation uses Taguchi's L36 orthogonal array, and optimization is then done in the data analysis software Minitab 17. The ANOVA results show that a 15-degree die semi-angle, 5 mm land width, and 6 m/min drawing speed yield the least springback. Furthermore, the optimization algorithms Particle Swarm Optimization (PSO), Simulated Annealing (SA), and Genetic Algorithm (GA) are applied, showing that a 15-degree die semi-angle, 10 mm land width, and 8 m/min drawing speed result in minimal springback, with an almost 10.5% improvement. Finally, the experimental results are validated with the Finite Element Analysis technique using ANSYS.
Sorting Order and Stability in Sorting.
Concept of Internal and External Sorting.
Bubble Sort, Insertion Sort, Selection Sort, Quick Sort, Merge Sort, Radix Sort, and Shell Sort.
External Sorting; time complexity analysis of Sorting Algorithms.
How to use nRF24L01 module with Arduino (CircuitDigest)
Learn how to wirelessly transmit sensor data using nRF24L01 and Arduino Uno. A simple project demonstrating real-time communication with DHT11 and OLED display.
RICS Membership - (The Royal Institution of Chartered Surveyors).pdf (MohamedAbdelkader115)
Glad to be one of only 14 members inside Kuwait to hold this credential.
Please check the members inside Kuwait via this link:
https://www.rics.org/networking/find-a-member.html?firstname=&lastname=&town=&country=Kuwait&member_grade=(AssocRICS)&expert_witness=&accrediation=&page=1
Fluid mechanics is the branch of physics concerned with the mechanics of fluids (liquids, gases, and plasmas) and the forces on them. Originally applied to water (hydromechanics), it found applications in a wide range of disciplines, including mechanical, aerospace, civil, chemical, and biomedical engineering, as well as geophysics, oceanography, meteorology, astrophysics, and biology.
It can be divided into fluid statics, the study of various fluids at rest, and fluid dynamics.
Fluid statics, also known as hydrostatics, is the study of fluids at rest, specifically when there's no relative motion between fluid particles. It focuses on the conditions under which fluids are in stable equilibrium and doesn't involve fluid motion.
Fluid kinematics is the branch of fluid mechanics that focuses on describing and analyzing the motion of fluids, such as liquids and gases, without considering the forces that cause the motion. It deals with the geometrical and temporal aspects of fluid flow, including velocity and acceleration. Fluid dynamics, on the other hand, considers the forces acting on the fluid.
Fluid dynamics is the study of the effect of forces on fluid motion. It is a branch of continuum mechanics, a subject which models matter without using the information that it is made out of atoms; that is, it models matter from a macroscopic viewpoint rather than from microscopic.
Fluid mechanics, especially fluid dynamics, is an active field of research, typically mathematically complex. Many problems are partly or wholly unsolved and are best addressed by numerical methods, typically using computers. A modern discipline, called computational fluid dynamics (CFD), is devoted to this approach. Particle image velocimetry, an experimental method for visualizing and analyzing fluid flow, also takes advantage of the highly visual nature of fluid flow.
Fundamentally, every fluid mechanical system is assumed to obey the basic laws:
Conservation of mass
Conservation of energy
Conservation of momentum
The continuum assumption
For example, the assumption that mass is conserved means that for any fixed control volume (for example, a spherical volume)—enclosed by a control surface—the rate of change of the mass contained in that volume is equal to the rate at which mass is passing through the surface from outside to inside, minus the rate at which mass is passing from inside to outside. This can be expressed as an equation in integral form over the control volume.
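Written out, the statement above is the standard integral form of mass conservation for a fixed control volume:

```latex
% Mass conservation for a fixed control volume V with boundary surface S,
% as described in the paragraph above (\mathbf{n} is the outward unit normal):
\frac{d}{dt}\int_{V}\rho\,dV \;=\; -\oint_{S}\rho\,\mathbf{u}\cdot\mathbf{n}\,dS
```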
The continuum assumption is an idealization of continuum mechanics under which fluids can be treated as continuous, even though, on a microscopic scale, they are composed of molecules. Under the continuum assumption, macroscopic (observed/measurable) properties such as density, pressure, temperature, and bulk velocity are taken to be well-defined at "infinitesimal" volume elements that are small in comparison to the characteristic length scale of the system, but large in comparison to the molecular length scale.
6. Being an old CSE teacher, I used to ask students how the computer was used during the 1970s and 1980s. Of course, today it is used mostly for entertainment. Would anyone like to share?
7. In the same sense, how is High Performance Computing really becoming necessary today and in the coming years? An example: Weka.
10. The Scientific Computing Campaign
[Diagram: THINK about what to run next → RUN a battery of tasks → COLLECT results → IMPROVE methods and codes.]
• Swift addresses most of these components
13. Swift
Swift is a data-flow oriented, coarse-grained scripting language that supports dataset typing and mapping, dataset iteration, conditional branching, and procedural composition.
Swift programs (or workflows) are written in a language called Swift.
Swift scripts are primarily concerned with processing (possibly large) collections of data files, by invoking programs to do that processing. Swift handles execution of such programs on remote sites by choosing sites, handling the staging of input and output files to and from the chosen sites, and remote execution of programs.
14.
• Swift is a parallel scripting language for multi-cores, clusters, grids, clouds, and supercomputers
– for loosely-coupled applications - applications linked by exchanging files
– debug on a laptop, then run on a Cray
• Swift scripts are easy to write
– simple high-level functional language with C-like syntax
– small Swift scripts can do large-scale work
• Swift is easy to run: contains all services for running workflows in one Java application
– simple command line interface
• Swift is fast: based on an efficient execution engine
– scales readily to millions of tasks
• Swift usage is growing:
– applications in neuroscience, proteomics, molecular dynamics, biochemistry, economics, statistics, earth systems science, and more.
15. Challenge: complexity of parallel computing
[Figure: the complex variety of computing resources, including clouds such as Magellan, Amazon, FutureGrid, and BioNimbus]
• Parallel distributed computing is HARD
• Swift harnesses diverse resources with simple scripts
• Many applications are well suited to this approach
• Motivates collaboration through libraries of pipelines and sharing of data and provenance
16. When do you need Swift?
Typical application: protein-ligand docking for drug screening
[Figure: O(10) proteins implicated in a disease × 2M+ ligands → 1M compute jobs → O(100K) drug candidates → tens of fruitful candidates for wetlab & APS; protein targets T1af7, T1r69, T1b72]
17. Solution: parallel scripting for high-level parallelism
[Figure: a Swift script, with its site list and app list, runs as a Java application on the submit host (laptop, Linux login node, …); Swift dispatches apps a1 and a2 to compute nodes, stages files f1, f2, f3 to and from a data server via file transport, and records workflow status, logs, and a provenance log]
Swift supports clusters, grids, and supercomputers.
18. Goals of the Swift language
Swift was designed to handle many aspects of the computing campaign:
• Ability to integrate many application components into a new workflow application
• Data structures for complex data organization
• Portability: separate site-specific configuration from application logic
• Logging, provenance, and plotting features
• Today, we will focus on supporting multiple-language applications at large scale
[Figure: the THINK → RUN → COLLECT → IMPROVE cycle]
19. A summary of Swift in a nutshell
• Swift scripts are text files ending in .swift. The swift command runs on any host and executes these scripts.
• swift is a Java application, which you can install almost anywhere. On Linux, just unpack the distribution tar file and add its bin/ directory to your PATH.
• Swift scripts run ordinary applications, just like shell scripts do.
• Swift makes it easy to run these applications on parallel and remote computers (from laptops to supercomputers). If you can ssh to the system, Swift can likely run applications there.
• The details of where to run applications and how to get files back and forth are described in configuration files separate from your program. Swift speaks ssh, PBS, Condor, SLURM, LSF, SGE, Cobalt, and Globus to run applications, and scp, http, ftp, and GridFTP to move data.
• The Swift language has 5 main data types: boolean, int, string, float, and file. Collections of these are dynamic, sparse arrays of arbitrary dimension, and structures of scalars and/or arrays defined by the type declaration.
• Swift file variables are "mapped" to external files.
• Swift sends files to and from remote systems for you automatically.
• Swift variables are "single assignment": once you set them you cannot change them (in a given block of code). This makes Swift a natural, "parallel data flow" language. This programming model keeps your workflow scripts simple and easy to write and understand.
20. A summary of Swift in a nutshell (continued)
• Swift lets you define functions to "wrap" application programs, and to cleanly structure more complex scripts. Swift app functions take files and parameters as inputs and return files as outputs.
• A compact set of built-in functions for string and file manipulation, type conversions, high-level I/O, etc. is provided.
• Swift's equivalent of printf() is tracef(), with limited and slightly different format codes.
• Swift's parallel foreach {} statement is the workhorse of the language, and executes all iterations of the loop concurrently. The actual number of parallel tasks executed is based on available resources and settable "throttles".
• Swift conceptually executes all the statements, expressions, and function calls in your program in parallel, based on data flow. These are similarly throttled based on available resources and settings.
• Swift also has if and switch statements for conditional execution. These are seldom needed in simple workflows, but they enable very dynamic workflow patterns to be specified, as sketched below.
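To make slides 19-20 concrete, here is a minimal sketch (not from the deck; the simulator binary and its -s flag are hypothetical) combining single-assignment file variables, a parallel foreach, and a conditional:

type file;

// App wrapper; "simulate" is a hypothetical stand-in binary
app (file o) simulate (int seed) {
  simulate "-s" seed stdout=filename(o);
}

foreach i in [1:100] {
  string fname = strcat("out/run_", i, ".txt");
  file out <single_file_mapper; file=fname>;
  if (i <= 50) {
    out = simulate(i);          // all iterations are eligible to run concurrently
  } else {
    out = simulate(i + 1000);   // each file variable is assigned exactly once
  }
}

How many of the 100 eligible tasks actually run at once is governed by the site configuration and throttle settings mentioned above.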
21. Swift programming model: all progress driven by concurrent dataflow

(int r) myproc (int i, int j)
{
  int x = F(i);
  int y = G(j);
  r = x + y;
}

• F() and G() are implemented in native code or external programs
• F() and G() run concurrently in different processes
• r is computed when they are both done
• This parallelism is automatic
• Works recursively throughout the program's call graph
22. Data-flow driven execution

(int r) myproc (int i)
{
  j = f(i);
  k = g(i);
  r = j + k;
}

• f() and g() are computed in parallel
• myproc() returns r when they are done
• This parallelism is AUTOMATIC
• Works recursively down the program's call graph
[Figure: dataflow graph; i feeds F() and G(), producing j and k, which are summed into r]
24. Swift programming model
• Data types
  int i = 4;
  string s = "hello world";
  file image<"snapshot.jpg">;
• Shell access
  app (file o) myapp(file f, int i)
  { mysim "-s" i @f @o; }
• Structured data
  typedef image file;
  image A[];
  type protein_run {
    file pdb_in; file sim_out;
  }
  bag<blob>[] B;
• Conventional expressions
  if (x == 3) {
    y = x+2;
    s = strcat("y: ", y);
  }
• Parallel loops
  foreach f,i in A {
    B[i] = convert(A[i]);
  }
• Data flow
  merge(analyze(B[0], B[1]),
        analyze(B[2], B[3]));

Swift: A language for distributed parallel scripting, J. Parallel Computing, 2011
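A small usage sketch (not from the deck) of how the dataflow style above composes app calls; the analyze_bin and merge_bin executables are hypothetical stand-ins:

type file;

app (file o) analyze (file a, file b) { analyze_bin @a @b @o; }
app (file o) merge (file x, file y)   { merge_bin @x @y @o; }

file B[] <simple_mapper; prefix="frag">;   // mapped input fragments

// Both analyze() calls run in parallel; merge() starts automatically
// once both of its inputs exist.
file m <"merged.out"> = merge(analyze(B[0], B[1]),
                              analyze(B[2], B[3]));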
25. Hierarchical programming model
[Figure: layered stack; a top-level dataflow script (user-workflow.swift) sits on distributed dataflow evaluation (data dependency resolution, work stealing / load balancing), which calls SWIG-generated wrappers for C, C++, and Fortran user libraries, which may themselves use MPI]
26. Support calls to embedded interpreters
We have plugins for Tcl, Julia, and QtScript.
27. A simple Swift script: functions run programs

type image;                               // Declare a "file" type.

app (image output) rotate (image input) {
  convert "-rotate" 180 @input @output;
}

image oldimg <"orion.2008.0117.jpg">;
image newimg <"output.jpg">;

newimg = rotate(oldimg);                  // runs the "convert" app
28. (same script as slide 27) The app declaration, app (image output) rotate (image input) { … }, is the "application"-wrapping function.
29. (same script as slide 27) image input is the input file and image output the output file; the mapped variables oldimg <"orion.2008.0117.jpg"> and newimg <"output.jpg"> name the actual files to use.
30. (same script as slide 27) The final line, newimg = rotate(oldimg), invokes the "rotate" function to run the "convert" application.
31. Parallelism via foreach { }

type image;

app (image output) flip(image input) {
  convert "-rotate" "180" @input @output;
}

image observations[] <simple_mapper; prefix="orion">;    // map inputs from local directory
image flipped[] <simple_mapper; prefix="flipped">;       // name outputs based on index

foreach obs,i in observations {
  flipped[i] = flip(obs);                                // process all dataset members in parallel
}
32. Large-scale parallelization with simple loops

foreach sim in [1:1000] {
  (structure[sim], log[sim]) = predict(p, 100., 25.);
}
result = analyze(structure);

[Figure: 1000 runs of the "predict" application feed a single analyze() step; protein targets T1af7, T1r69, T1b72]
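A more complete version of this loop (a sketch, not from the deck: the declarations and the predict/analyze app signatures are assumed, and the predict/analyze binaries are hypothetical):

type file;

app (file s, file l) predict (file p, float temp, float dt) {
  predict @p temp dt @s @l;
}
app (file r) analyze (file s[]) {
  analyze filenames(s) stdout=filename(r);
}

file p <"protein.pdb">;
file structure[] <simple_mapper; prefix="structure">;
file log[]       <simple_mapper; prefix="log">;

foreach sim in [1:1000] {
  (structure[sim], log[sim]) = predict(p, 100., 25.);
}
// analyze() starts only after all 1000 predict() outputs exist
file result <"result.out"> = analyze(structure);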
33. Nested loops generate massive parallelism

Sweep()
{
  int nSim = 1000;
  int maxRounds = 3;
  Protein pSet[] <ext; exec="Protein.map">;
  float startTemp[] = [ 100.0, 200.0 ];
  float delT[] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];

  foreach p, pn in pSet {
    foreach t in startTemp {
      foreach d in delT {
        ItFix(p, nSim, maxRounds, t, d);
      }
    }
  }
}

10 proteins × 1000 simulations × 3 rounds × 2 temps × 5 deltas = 300K tasks
37. SLIM: Swift sketch

type DataFile;
type ModelFile;
type FloatFile;

// Function declarations (bodies elided)
app (DataFile[] dFiles) split(DataFile inputFile, int noOfPartition) {
}
app (ModelFile newModel, FloatFile newParameter) firstStage(DataFile data, ModelFile model) {
}
app (ModelFile newModel, FloatFile newParameter) reduce(ModelFile[] model, FloatFile[] parameter) {
}
app (ModelFile newModel) combine(ModelFile reducedModel, FloatFile reducedParameter, ModelFile oldModel) {
}
app (int terminate) check(ModelFile finalModel) {
}
38. SLIM: Swift sketch

// Variables to hold the input data
DataFile inputData <"MyData.dat">;
ModelFile earthModel <"MyModel.mdl">;

// Variables to hold the finalized models
ModelFile model[];

// Variables for the reduce stage
ModelFile reducedModel[];
FloatFile reducedFloatParameter[];

// Get the number of partitions from a command line parameter
int n = @toint(@arg("n","1"));

model[0] = earthModel;

// Partition the input data
DataFile partitionedDataFiles[] = split(inputData, n);
39. SLIM: Swift sketch

// Iterate sequentially until the termination condition is met
iterate v {
  // Variables to hold the output of the first stage
  ModelFile firstStageModel[] <simple_mapper; location="output",
      prefix=@strcat(v, "_"), suffix=".mdl">;
  FloatFile firstStageFloatParameter[] <simple_mapper; padding=3, location="output",
      prefix=@strcat(v, "_"), suffix=".float">;

  // Parallel for loop
  foreach partitionedFile, count in partitionedDataFiles {
    // First stage
    (firstStageModel[count], firstStageFloatParameter[count]) = firstStage(model[v], partitionedFile);
  }

  // Reduce stage (use the files for synchronization)
  (reducedModel[v], reducedFloatParameter[v]) = reduce(firstStageModel, firstStageFloatParameter);

  // Combine stage
  model[v+1] = combine(reducedModel[v], reducedFloatParameter[v], model[v]);

  // Check the termination condition here
  int shouldTerminate = check(model[v+1]);
} until (shouldTerminate != 1);
40. Software stack for Swift on ALCF BG/Ps
[Figure: layered stack. Swift scripts, shell scripts, and app invocations run on top of:
• Swift: scripting language, task coordination, throttling, data management, restart
• Coaster execution provider: per-node agents for fast task dispatch across all nodes
• Linux OS: complete, high-performance Linux compute node OS with full fork/exec
• File system]
41. Running Swift scripts
[Figure: a Java application on the head node runs the Swift script and the Coaster service; Coaster workers on compute nodes access files f1, f2, f3 on the shared file system]
• Start Coaster
  – start the coaster service
  – start the coaster workers (qsub, ssh, … etc.)
• Run the Swift script
42. Swift vs. MapReduce
What to compare (to maintain the level of abstraction)?
• Swift framework (language, compiler, runtime, scheduler plugin – Coasters, storage)
• Map-reduce framework (Hive, Hadoop, HDFS)
Similarities:
• Goals: a productive programming environment that hides complexities related to parallel execution
• Able to scale, hide individual failures, balance load
43. Swift vs. MapReduce
Swift advantages:
• Intuitive programming model.
• Platform independent: from small clusters, to large ones, to large 'exotic' architectures (BG/P) and distributed resources (grids).
  – Map-reduce models will not support all of these.
  – Swift can generally use the software stack available on the target machine with no changes.
• More flexible data model (just your old files):
  – no migration pain;
  – easy to share with other applications.
• Optimized for coarse-granularity workflows:
  – e.g., files staged in an in-memory file system; optimizations for specific patterns.
44. Swift/T: Enabling high-performance workflows
• Write site-independent scripts
• Automatic parallelization and data movement
• Run native code and script fragments as applications
• Rapidly subdivide large partitions for MPI jobs
• Move work to data locations
[Figure: Swift/T control processes coordinate worker processes that invoke C, C++, and Fortran code, coupled by MPI]
Demonstrated scale: 64K cores of Blue Waters, 2 billion Python tasks, 14 million Python tasks/s.
45. Using Swift/Python/Numpy

global const string numpy = "from numpy import *\n\n";

typedef matrix string;

(matrix A) eye(int n) {
  command = sprintf("repr(eye(%i))", n);
  code = numpy + command;
  matrix t = python(code);
  A = replace_all(t, "\n", "", 0);
}

(matrix R) add(matrix A1, matrix A2) {
  command = sprintf("repr(%s+%s)", A1, A2);
  code = numpy + command;
  matrix t = python(code);
  R = replace_all(t, "\n", "", 0);
}

a1 = eye(3);
a2 = eye(3);
sum = add(a1, a2);
printf("2*eye(3)=%s", sum);
46. Fully parallel evaluation of complex scripts

int X = 100, Y = 100;
int A[][];
int B[];
foreach x in [0:X-1] {
  foreach y in [0:Y-1] {
    if (check(x, y)) {
      A[x][y] = g(f(x), f(y));
    } else {
      A[x][y] = 0;
    }
  }
  B[x] = sum(A[x]);
}
47. Centralized evaluation can be a bottleneck at extreme scales
[Figure: what we had (Swift/K, centralized evaluation) vs. what extreme scale needs (Swift/T, distributed evaluation)]
48. MPI: The Message Passing Interface
• Programming model used on large supercomputers
• Can run on many networks, including sockets, or shared memory
• Standard API for C and Fortran; other languages have working implementations
• Contains communication calls for
  – point-to-point (send/recv)
  – collectives (broadcast, reduce, etc.)
• Interesting concepts
  – Communicators: collections of communicating processes and a context
  – Data types: language-independent data marshaling scheme
49. ADLB: Asynchronous Dynamic Load Balancer
• An MPI library for master-worker workloads in C
• Uses a variable-size, scalable network of servers
• Servers implement work-stealing
• The work unit is a byte array
• Optional work priorities, targets, types
• For Swift/T, we added:
  – server-stored data
  – data-dependent execution
  – Tcl bindings!
[Figure: servers and workers]
• Lusk et al. More scalability, less pain: A simple programming model and its implementation for extreme computing.
50. Flexible placement of server ranks in a Swift/T job
[Figure: two layouts of a 16-node job: one Swift server core per node vs. four dedicated Swift server nodes for the job]
51. Example distributed execution
• Code:
  A[2] = f(getenv("N"));
  A[3] = g(A[2]);
• Servers evaluate dataflow operations; workers execute tasks
[Figure: one control process performs getenv() and submits f (task put); a worker gets the task, processes f, and stores A[2]; a second control process subscribes to A[2] and submits g (task put); a worker gets the task, processes g, and stores A[3]]
• Wozniak et al. Turbine: A distributed-memory dataflow engine for high performance many-task applications. Fundamenta Informaticae 128(3), 2013.
52. Swift/T-specific features
• Task locality: ability to send a task to a process
  – allows for big-data-style applications
  – allows for stateful objects to remain resident in the workflow
    location L = find_data(D);
    int y = @location=L f(D, x);
• Data broadcast
• Task priorities: ability to set task priority
  – useful for tweaking load balancing
• Updateable variables
  – allow data to be modified after its initial write
  – consumer tasks may receive original or updated values when they emerge from the work queue
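A small sketch (not from the deck) of how these annotations read in practice, reusing the find_data/@location pattern above; the data array D, the apps f() and analyze(), and the exact annotation syntax are assumptions that vary by Swift/T version:

// Hypothetical data array and inputs
blob D[];
int n = 16;
int x = 42;

foreach i in [0:n-1] {
  location L = find_data(D[i]);      // look up which process holds chunk i
  int y = @location=L f(D[i], x);    // run f() on that process
  int z = @prio=i analyze(y);        // priority hint to the load balancer
}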
53. Swift/T: scaling of a trivial foreach { } loop
[Figure: task rates for 100-microsecond to 10-millisecond tasks on up to 512K integer cores of Blue Waters]
55. Swift/T compiler and runtime
• STC translates high-level Swift expressions into low-level Turbine operations:
  – create/store/retrieve typed data
  – manage arrays
  – manage data-dependent tasks
56. Swift code in dataflow

x = g();
if (x > 0) {
  n = f(x);
  foreach i in [0:n-1] {
    output(p(i));
  }
}

• Dataflow definitions create nodes in the dataflow graph
• Dataflow assignments create edges
• In typical (DAG) workflow languages, this forms a static graph
• In Swift, the graph can grow dynamically: code fragments are evaluated (conditionally) as a result of dataflow
• In its early implementation, these fragments were just tasks
[Figure: the graph unfolds as x is produced; the if-fragment and then the foreach-fragment are evaluated]
60. Parallel tasks in Swift/T
• Swift expression: z = @par=8 f(x,y);
• The ADLB server finds 8 available workers
  – workers receive ranks from the ADLB server
  – they perform comm = MPI_Comm_create_group()
• The workers perform f(x,y), communicating on comm
61. LAMMPS parallel tasks
• LAMMPS provides a convenient C++ API
• Easily used by Swift/T parallel tasks

foreach i in [0:20] {
  t = 300 + i;
  sed_command = sprintf("s/_TEMPERATURE_/%i/g", t);
  lammps_file_name = sprintf("input-%i.inp", t);
  lammps_args = "-i " + lammps_file_name;
  file lammps_input<lammps_file_name> =
    sed(filter, sed_command) =>
    @par=8 lammps(lammps_args);
}

Tasks with varying sizes packed into a big MPI run
[Figure: utilization plot; black: compute, blue: message, white: idle]
62. GeMTC: GPU-enabled Many-Task Computing
Goals: 1) MTC support, 2) programmability, 3) efficiency, 4) MPMD on SIMD, 5) increased concurrency to warp level
Approach: design & implement GeMTC middleware, which 1) manages the GPU, 2) spreads work across host/device, 3) integrates with a workflow system (Swift/T)
Motivation: support for MTC on all accelerators!
63. Logging and debugging in Swift
• Traditionally, Swift programs are debugged through the log or the TUI (text user interface)
• Logs were produced using normal methods, containing:
  – variable names and values as set with respect to thread
  – calls to Swift functions
  – calls to application code
• A restart log could be produced to restart a large Swift run after certain fault conditions
• These methods require a single Swift site: they do not scale to larger runs
64. Logging in MPI
• The Message Passing Environment (MPE) is a common approach to logging MPI programs
• Can log MPI calls or application events; can store arbitrary data
• Can visualize the log with Jumpshot
• Partial logs are stored at the site of each process
  – written as necessary to the shared file system, in large blocks, in parallel
  – results are merged into a big log file (CLOG, SLOG)
• Work has been done to optimize the file format for various queries
65. Swift for Really Parallel Builds

Apps:

app (object_file o) gcc(c_file c, string cflags[]) {
  // Example: gcc -c -O2 -o f.o f.c
  "gcc" "-c" cflags "-o" o c;
}
app (x_file x) ld(object_file o[], string ldflags[]) {
  // Example: gcc -o f.x f1.o f2.o ...
  "gcc" ldflags "-o" x o;
}
app (output_file o) run(x_file x) {
  "sh" "-c" x @stdout=o;
}
app (timing_file t) extract(output_file o) {
  "tail" "-1" o "|" "cut" "-f" "2" "-d" " " @stdout=t;
}

Swift code:

string program_name = "programs/program1.c";
c_file c = input(program_name);

foreach O_level in [0:3] {
  // make file names…
  // Construct compiler flags
  string O_flag = sprintf("-O%i", O_level);
  string cflags[] = [ "-fPIC", O_flag ];
  object_file o<my_object> = gcc(c, cflags);
  object_file objects[] = [ o ];
  string ldflags[] = [];
  // Link the program
  x_file x<my_executable> = ld(objects, ldflags);
  // Run the program
  output_file out<my_output> = run(x);
  // Extract the run time from the program output
  timing_file t<my_time> = extract(out);
}
66. Abstract, extensible MapReduce in Swift

main {
  file d[];
  int N = string2int(argv("N"));
  // Map phase
  foreach i in [0:N-1] {
    file a = find_file(i);
    d[i] = map_function(a);
  }
  // Reduce phase
  file final <"final.data"> = merge(d, 0, N-1);
}

(file o) merge(file d[], int start, int stop) {
  if (stop-start == 1) {
    // Base case: merge a pair
    o = merge_pair(d[start], d[stop]);
  } else {
    // Merge a pair of recursive calls
    int n = stop-start;
    int s = n % 2;
    o = merge_pair(merge(d, start, start+s),
                   merge(d, start+s+1, stop));
  }
}

• The user needs to implement map_function() and merge_pair(); a sketch follows below
• These may be implemented in native code, Python, etc.
• Could add annotations
• Could add additional custom application logic
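A minimal sketch (not from the deck) of what the two user-supplied pieces might look like as app wrappers; the word-count binaries wc_map and wc_merge are hypothetical:

type file;

// Map: produce a partial result from one input file
app (file o) map_function (file a) {
  wc_map @a @stdout=o;
}

// Reduce step: combine two partial results into one
app (file o) merge_pair (file x, file y) {
  wc_merge @x @y @stdout=o;
}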
67. Crystal Coordinate Transformation Workflow
• Goal: transform a 3D image stack from detector coordinates to real-space coordinates
  – operate on up to 50 GB of data
  – relatively light processing: I/O rates are critical
  – core numerics in C++, Qt
  – parallelism via Swift
• Challenges
  – coupling C++ to Swift via QtScript
  – data access to an HDF file on distributed storage
69. CCTW: Swift/T application (C++)

bag<blob> M[];
foreach i in [1:n] {
  blob b1 = cctw_input("pznpt.nxs");
  blob b2[];
  int outputId[];
  (outputId, b2) = cctw_transform(i, b1);
  foreach b, j in b2 {
    int slot = outputId[j];
    M[slot] += b;
  }
}
foreach g in M {
  blob b = cctw_merge(g);
  cctw_write(b);
}
70. Stateful external interpreters
• Desire to use high-level, 3rd-party algorithms in Python or R to orchestrate Swift workflows, e.g.:
  – Python DEAP for evolutionary algorithms
  – the R language GA package
• Typical control pattern:
  – the GA minimizes the cost function
  – you pass the cost function to the library and wait
• We want Swift to obtain the parameters from the library
  – we launch a stateful interpreter on a thread
  – the "cost function" is a dummy that returns the parameters to Swift over IPC
  – Swift passes the real cost-function results back to the library over IPC
• Achieve high productivity and high scalability
  – the library is not modified: it is unaware of the framework!
  – application logic extensions live in a high-level script
[Figure: a Swift worker hosts a Python/R interpreter running the GA; tasks and results flow over IPC between it and the MPI process and load balancer]
73. Native code overview
• Turbine is a Tcl-based system
• It is easy to call Tcl from Swift/T
• Thus, we simply need to wrap the native function in a Tcl wrapper
• You can do this manually or with SWIG
1. Build the native function as a shared library
2. Wrap the native function in Tcl with SWIG
3. Build a Tcl package
4. Call the Tcl function from Swift
5. Run it: just make sure the package can be found at run time
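As a sketch of step 4 (not from the deck, and the exact syntax varies across Swift/T versions): assuming a hypothetical Tcl package mypkg that wraps a native function mult, the Swift/T leaf-function declaration might look like:

// Hypothetical: bind the Swift/T function mult() to the Tcl command mypkg::mult
(int z) mult(int x, int y) "mypkg" "0.1"
  [ "set <<z>> [ mypkg::mult <<x>> <<y>> ]" ];

int r = mult(6, 7);   // executes the native code via its Tcl wrapper
trace(r);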
74. Simple app

type file;

app (file out) echo_app (string s) {
  echo s stdout=filename(out);
}

file out <"out.txt">;
out = echo_app("Hello world!");
75. Foreach example

type file;

app (file o) simulate_app () {
  simulate stdout=filename(o);
}

foreach i in [1:10] {
  string fname = strcat("output/sim_", i, ".out");
  file f <single_file_mapper; file=fname>;
  f = simulate_app();
}
76. Multiple apps

type file;

app (file o) simulate_app (int time) {
  simulate "-t" time stdout=filename(o);
}
app (file o) stats_app (file s[]) {
  stats filenames(s) stdout=filename(o);
}

file sims[];
int time = 5;
int nsims = 10;

foreach i in [1:nsims] {
  string fname = strcat("output/sim_", i, ".out");
  file simout <single_file_mapper; file=fname>;
  simout = simulate_app(time);
  sims[i] = simout;
}

file average <"output/average.out">;
average = stats_app(sims);
78. Multi stage

# Values that shape the run
int nsim = 10;    # number of simulation programs to run
int steps = 1;    # number of timesteps (seconds) per simulation
int range = 100;  # range of the generated random numbers
int values = 10;  # number of values generated per simulation

# Main script and data
tracef("\n*** Script parameters: nsim=%i range=%i num values=%i\n\n", nsim, range, values);

# Dynamically generated bias for simulation ensemble
file seedfile <"output/seed.dat">;
seedfile = genseed_app(1);
int seedval = readData(seedfile);
tracef("Generated seed=%i\n", seedval);

file sims[];