- Ironic is an OpenStack service and Nova driver that provides bare metal provisioning capabilities, allowing deployment of images to physical machines without using virtualization.
- It has a distributed architecture with components such as ironic-api, ironic-conductor, and a deploy ramdisk for provisioning hardware.
- Ironic supports various technologies for remote access and booting including PXE, IPMI, and iSCSI. Configuration options allow it to work with both homogeneous and heterogeneous hardware environments.
This document provides a guide for developing distributed applications that use ZooKeeper. It discusses ZooKeeper's data model including znodes, ephemeral nodes, and sequence nodes. It describes ZooKeeper sessions, watches, consistency guarantees, and available bindings. It provides an overview of common ZooKeeper operations like connecting, reads, writes, and handling watches. It also discusses program structure, common problems, and troubleshooting. The guide is intended to help developers understand key ZooKeeper concepts and how to integrate ZooKeeper coordination services into their distributed applications.
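To make those operations concrete, here is a minimal sketch using the kazoo Python client; the host address and znode paths are illustrative, not taken from the guide:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()  # blocks until the ZooKeeper session is established

# Write: a persistent znode for shared configuration, created if absent.
if not zk.exists("/app/config"):
    zk.create("/app/config", b"v1", makepath=True)

# An ephemeral, sequenced znode: it vanishes when this session ends, and
# the server appends a monotonically increasing counter to the name.
member = zk.create("/app/members/m-", b"", ephemeral=True,
                   sequence=True, makepath=True)
print("registered as", member)  # e.g. /app/members/m-0000000003

# Read with a one-shot watch: the callback fires on the next change only.
def on_change(event):
    print("config changed:", event.path, event.type)

data, stat = zk.get("/app/config", watch=on_change)
print("config =", data, "version =", stat.version)

zk.set("/app/config", b"v2")  # triggers on_change exactly once
zk.stop()
```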
Ironic is a modern open-source tool for hardware provisioning. Combining a RESTful API, a scale-out control plane, and pluggable hardware drivers for both in- and out-of-band management, Ironic installs operating systems in a fast, efficient, and reliable fashion.
In fact, Ironic does not “install” an operating system in the traditional sense – it doesn’t use a kickstart/preseed file or an ISO image. Instead, compressed machine images are copied onto each host, and a minimal configuration (IP, host name, SSH keys) is applied at first boot. This guarantees the consistency of the initial state of each machine in a way that traditional installers do not. Bonus: it’s also faster!
With a vibrant community of developers from the most popular server hardware vendors, Ironic’s support for many of the latest and greatest management technologies is coming directly from the creators of these technologies. Meanwhile, the project’s leaders work to create a common abstraction layer that provides a consistent experience across all supported hardware. But Ironic is still a young project – it was only started in 2013 – and there is much on the roadmap.
In this session, Devananda will demonstrate how to install Ironic with Ansible, modify a cloud image for bare metal, and deploy it to a server. He will discuss the history and architecture of the project, and its current goals and challenges. Attendees should be familiar with the task of hardware provisioning and standards like PXE and IPMI, but do not need deep knowledge of related tools.
An Updated Performance Comparison of Virtual Machines and Linux Containers - Kento Aoyama
The document compares the performance of virtual machines (KVM) and Linux containers (Docker) by running benchmarks that test CPU, memory, network, and file I/O performance. It finds that Docker containers perform comparably to native Linux for most benchmarks, while KVM virtual machines have higher overhead and perform worse than Docker containers or native Linux for several tests, especially those involving CPU, random memory access, and file I/O. The study provides a useful comparison of the performance of these two virtualization technologies.
Learn from dozens of large-scale deployments how to get the most out of your Kubernetes environment:
- Container image optimization
- Organizing namespaces
- Readiness and Liveness probes
- Resource requests and limits
- Failing with grace
- Mapping external services
- Upgrading clusters with zero downtime
This document discusses how to port Erlang and OTP to run on OSv without forking or executing external processes. Erlang ports allow communication with external processes but rely on forking and executing the port executable. As OSv does not support forking or execution, an alternative approach for Erlang ports is needed. Suggested approaches include using linked-in drivers written as shared objects, NIFs, or a custom in-process protocol to communicate with external processes without forking.
ZooKeeper is an open-source coordination service for distributed applications that provides common services like leader election, configuration management, and locks in a simple interface to help distributed processes coordinate actions and share information. It provides guarantees around consistency, reliability, and timeliness to applications using its hierarchical data model and APIs. Popular distributed systems like Hadoop and Kafka use ZooKeeper for tasks such as cluster management, metadata storage, and detecting node failures.
RENCI User Group Meeting 2017 - I Upgraded iRODS and I still have all my hair - John Constable
The document summarizes the experience of upgrading a large iRODS installation from version 3.3.1 to 4.1.8 over the course of a year. Key aspects included developing unit tests using BATS, working closely with RENCI on bug fixes, and dealing with issues around large file uploads and full resources during the multi-stage upgrade process. Lessons learned included the importance of testing, configuration management, and working with users and the iRODS community.
Apache Mesos is the first open source cluster manager that handles the workload efficiently in a distributed environment through dynamic resource sharing and isolation.
Docker allows building and running applications inside lightweight containers. Some key benefits of Docker include:
- Portability - Dockerized applications are completely portable and can run on any infrastructure from development machines to production servers.
- Consistency - Docker ensures that application dependencies and environments are always the same, regardless of where the application is run.
- Efficiency - Docker containers are lightweight since they don't need virtualization layers like VMs. This allows for higher density and more efficient use of resources.
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster - byonggon chun
Introduces the container runtime environment set up with Kubernetes and various CRI runtimes (Docker, containerd, CRI-O), and the method of NUMA-aware resource management (CPU Manager, Topology Manager, etc.) for CNFs (Containerized Network Functions) within Kubernetes, along with related issues.
Centralized Application Configuration with Spring and Apache Zookeeper - Ryan Gardner
From talk given at Spring One 2gx Dallas, 2014
Application configuration is an evolution. It starts as hard-coded strings in your application and hopefully progresses to something external, such as a file or system property that can be changed without deployment. But what happens when other enterprise concerns enter the mix, such as audit requirements or access control around who can make changes? How do you maintain the consistency of values across too many application servers to manage at one time from a terminal window? The next step in the application configuration evolution is centralized configuration that can be accessed by your applications as they move through your various environments on their way to production. Such a service transfers the ownership of configuration from the last developer who touched the code to a well-versed application owner who is responsible for the configuration of the application across all environments. At Dealer.com, we have created one such solution that relies on Apache ZooKeeper to handle the storage and coordination of the configuration data and Spring to handle the retrieval, creation and registration of configured objects in each application. The end result is a transparent framework that provides the same configured objects that could have been created using a Spring configuration, configuration file and property value wiring. This talk will cover both the why and how of our solution, with a focus on how we leveraged the powerful attributes of both Apache ZooKeeper and Spring to rid our application of local configuration files and provide a consistent mechanism for application configuration in our enterprise.
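The Dealer.com framework itself is Spring/Java, but the underlying ZooKeeper pattern it relies on, each instance watching a centrally owned configuration znode, can be sketched in a few lines of Python with kazoo (the ensemble address and key path here are hypothetical):

```python
from kazoo.client import KazooClient

# Hypothetical ensemble and key path; in the talk's setup the values are
# owned by an application owner, not by the deployed code.
zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

CONFIG_KEY = "/config/my-service/db.pool.size"

# DataWatch re-registers itself after every change, so every running
# instance sees each centrally made update and can re-wire its objects.
@zk.DataWatch(CONFIG_KEY)
def on_config(data, stat):
    if data is not None:
        print("pool size now", int(data), "znode version", stat.version)
```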
This document discusses zero-configuration provisioning of Kubernetes clusters on unmanaged infrastructure. It describes using immutable bootstrapping to provision operating systems and install Docker and Kubernetes (using Kubeadm) across nodes without requiring centralized orchestration or SSH access. The document also discusses potential future directions for the Kubernetes community regarding node admission controls and dynamic Kubelet configuration to further reduce external configuration requirements during cluster provisioning.
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters - Yahoo Developer Network
Deep learning is a critical capability for gaining intelligence from datasets. Many existing frameworks require a separate cluster for deep learning, and multiple programs have to be created for a typical machine learning pipeline. Separate clusters require large datasets to be transferred between them, and introduce unwanted system complexity and latency for end-to-end learning.
Yahoo introduced CaffeOnSpark to alleviate those pain points and bring deep learning onto Hadoop and Spark clusters. By combining salient features from the deep learning framework Caffe and the big-data framework Apache Spark, CaffeOnSpark enables distributed deep learning on a cluster of GPU and CPU servers. The framework is complementary to the non-deep-learning libraries MLlib and Spark SQL, and its data-frame-style API provides Spark applications with an easy mechanism to invoke deep learning over distributed datasets. Its server-to-server direct communication (Ethernet or InfiniBand) achieves faster learning and eliminates the scalability bottleneck.
Recently, we released CaffeOnSpark at github.com/yahoo/CaffeOnSpark under the Apache 2.0 License. In this talk, we will provide a technical overview of CaffeOnSpark, its API, and deployment on a private or public cloud (AWS EC2). An IPython notebook demo will also show how CaffeOnSpark works with other Spark packages (e.g., MLlib).
Speakers:
Andy Feng is a VP Architecture at Yahoo, leading the architecture and design of big data and machine learning initiatives. He has architected major platforms for personalization, ads serving, NoSQL, and cloud infrastructure.
Jun Shi is a Principal Engineer at Yahoo who specializes in machine learning platforms and large-scale machine learning algorithms. Prior to Yahoo, he was designing wireless communication chips at Broadcom, Qualcomm and Intel.
Mridul Jain is Senior Principal at Yahoo, focusing on machine learning and big data platforms (especially realtime processing). He has worked on trending algorithms for search, unstructured content extraction, realtime processing for central monitoring platform, and is the co-author of Pig on Storm.
Spark adds abstractions, generalizations, and performance optimizations that achieve much better efficiency, especially on iterative workloads. Yet Spark does not concern itself with providing a distributed file system, whereas Hadoop ships with one: HDFS.
Spark can leverage existing distributed file systems (like HDFS), a distributed database (like HBase), traditional databases through its JDBC or ODBC adapters, and flat files in local file systems or on an object store like S3 in the Amazon cloud.
The Hadoop MapReduce framework is similar to Spark in that it uses a master/worker paradigm. It has one master node (comprising a job tracker, name node, and RAM) and worker nodes (each comprising a task tracker, data node, and RAM). The task tracker on a worker node is analogous to an executor in a Spark environment.
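A short PySpark sketch of the storage flexibility described above; the paths, table names, and credentials are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("storage-demo").getOrCreate()

# The same DataFrame API regardless of where the bytes live:
events = spark.read.parquet("hdfs://namenode:8020/warehouse/events")  # HDFS
flat = spark.read.csv("s3a://my-bucket/exports/*.csv", header=True)   # S3
orders = (spark.read.format("jdbc")                                   # RDBMS
          .option("url", "jdbc:postgresql://db:5432/shop")
          .option("dbtable", "orders")
          .option("user", "report").option("password", "secret")
          .load())

# Caching keeps the working set in executor memory across jobs, which is
# where Spark's advantage on iterative workloads comes from.
events.cache()
for _ in range(5):
    events.groupBy("user_id").count().collect()  # re-reads from cache

spark.stop()
```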
This document discusses ZooKeeper, an open-source server that enables distributed coordination. It provides instructions for installing ZooKeeper, describes ZooKeeper's data tree and API, and exercises for interacting with ZooKeeper including creating znodes, using watches, and setting up an ensemble across multiple servers.
Open Liberty is an open source lightweight Java runtime optimized for cloud-native applications. It provides a small footprint, high performance runtime with only the necessary APIs and features loaded. This allows for fast startup times, low memory usage, and continuous delivery. Open Liberty is optimized for containers and Kubernetes and provides zero migration between versions and platforms.
.NET Core, ASP.NET Core Course, Session 4 - Amin Mesbahi
Session 4,
What is Garbage Collector?
Fundamentals of memory
Conditions for a garbage collection
Generations
Configuring garbage collection
Workstation
Server
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps... - Yahoo Developer Network
Apache Hadoop YARN is a modern resource-management platform that handles resource scheduling, isolation and multi-tenancy for a variety of data processing engines that can co-exist and share a single data-center in a cost-effective manner.
In the first half of the talk, we are going to give a brief look into some of the big efforts cooking in the Apache Hadoop YARN community.
We will then dig deeper into one of the efforts - supporting Docker runtime in YARN. Docker is an application container engine that enables developers and sysadmins to build, deploy and run containerized applications. In this half, we'll discuss container runtimes in YARN, with a focus on using the DockerContainerRuntime to run various docker applications under YARN. Support for container runtimes (including the docker container runtime) was recently added to the Linux Container Executor (YARN-3611 and its sub-tasks). We’ll walk through various aspects of running docker containers under YARN - resource isolation, some security aspects (for example container capabilities, privileged containers, user namespaces) and other work in progress features like image localization and support for different networking modes.
Speakers:
Vinod Kumar Vavilapalli is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long-term Apache Hadoop contributor, a Hadoop committer, and a member of the Apache Hadoop PMC. He has a Bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology Roorkee. He has been working on Hadoop for nearly 9 years and still has fun doing it. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore, before Hortonworks happened. He is passionate about using computers to change the world for the better, bit by bit.
Sidharta Seethana is a software engineer at Hortonworks. He works on the YARN team, focusing on bringing new kinds of workloads to YARN. Prior to joining Hortonworks, Sidharta spent 10 years at Yahoo! Inc., working on a variety of large-scale distributed systems for core platforms/web services, search and marketplace properties, developer network, and personalization.
High Performance Computing - Cloud Point of View - aragozin
This document discusses high performance computing in the cloud. It covers different types of workloads like I/O bound, CPU bound, and latency bound tasks. It also discusses handling task streams and structured batch jobs in the cloud. It proposes using techniques like worker pools, task queues, routing overlays, and task stealing for scheduling tasks. It discusses challenges around distributing large data sets across cloud resources and proposes solutions like caching data in memory grids. Finally, it argues that frameworks like Hadoop are not well suited for the cloud and proposes cloud-friendly alternatives like Peregrine and Spark.
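The worker-pool/task-queue technique the talk proposes, reduced to a single-machine Python sketch (in a real cloud deployment the queue would be an external service and the workers separate instances):

```python
import queue
import threading

tasks = queue.Queue()  # the task queue; a cloud version would be external

def worker():
    while True:
        job = tasks.get()
        if job is None:        # poison pill: tells this worker to exit
            break
        print(threading.current_thread().name, "->", job ** 2)
        tasks.task_done()

# A fixed worker pool pulls from the shared queue; "task stealing" falls
# out naturally, since idle workers simply take the next available task.
pool = [threading.Thread(target=worker, name=f"w{i}") for i in range(4)]
for t in pool:
    t.start()

for job in range(10):
    tasks.put(job)

tasks.join()               # wait for the queue to drain
for _ in pool:
    tasks.put(None)
for t in pool:
    t.join()
```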
This document discusses how to use Docker to containerize and deploy Python web applications. It provides steps to install Docker, build a sample Flask application into a Docker image, run the container locally, and deploy the containerized application to AWS. Key points covered include using Dockerfiles to create images, the Docker index for sharing images, and port mapping when running containers.
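A hedged sketch of the local build-and-run steps using the Docker SDK for Python rather than the command line; the image tag and ports are illustrative:

```python
import docker

client = docker.from_env()

# Build an image from a directory holding the Flask app and its Dockerfile
# (path and tag are illustrative).
image, _build_logs = client.images.build(path=".", tag="flask-demo:latest")

# Run it locally, publishing container port 5000 on host port 8000:
# the port-mapping step the document describes.
container = client.containers.run(
    "flask-demo:latest",
    detach=True,
    ports={"5000/tcp": 8000},
)
print("started", container.short_id)
```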
What is Apache Mesos and how to use it. A short introduction to distributed fault-tolerant systems using ZooKeeper and Mesos. #installfest Prague 2014
ZooKeeper is a highly available, scalable, distributed configuration, consensus, group membership, leader election, naming and coordination service. It provides a hierarchical namespace and basic operations like create, delete, and read data. It is useful for building distributed applications and services like queues. Future releases will focus on monitoring improvements, read-only mode, and failure detection models. The community is working on features like children for ephemeral nodes and viewing session information.
Beyond x86: Managing Multi-platform Environments with OpenStack - Phil Estes
A talk by Shaun Murakami and Phil Estes at the OpenStack Summit Paris, Fall 2014. We look at real-world scenarios deploying and managing workloads in a multi-platform environment of compute architectures including IBM System z (traditional mainframe), POWER, and Intel architectures. Moving beyond a homogeneous data center to a mix of enterprise architectures adds potential complexities around hypervisor support, deployment capabilities, and management of disparate workloads, some of which might be CPU-centric while others are not.
Using schedulers like Marathon and Aurora helps get your applications scheduled and executing on Mesos. In many cases it makes sense to build a framework and integrate directly. This talk will break down what is involved in building a framework, how to do it, with examples, and why you would want to. Frameworks are not only for generally available software applications (like Kafka, HDFS, Spark, etc.) but can also be used for custom, internally built R&D software applications.
This document introduces Pentaho Kettle, an ETL tool. It explains concepts such as transformations, steps, and jobs. It also covers the installation and use of the OpenErp Kettle Step plugin, which makes it easy to dump data into OpenERP. Finally, it provides details on features such as clustering, execution, and debugging.
The document summarizes challenges in continuing to improve single-processor performance and introduces multicore architectures as a solution. It discusses how the conventional wisdom in computer architecture has changed, noting issues like the power wall, memory wall, and limitations to extracting more instruction level parallelism (ILP). To overcome these challenges, architectures are moving to multiple cores per chip to improve parallelism and efficiency. Caches are area- and power-intensive, so multiple cores running at lower voltage and frequency can increase throughput while reducing power.
This document describes the data transformation tools in Pentaho Data Integration. It presents several transformation steps such as adding checksums, constants, sequences, and XML fields. It also describes steps for performing calculations, concatenating fields, replacing strings, creating numeric ranges, and selecting, replacing, and sorting field values. Finally, it includes steps for splitting fields, applying string operations, and removing duplicate rows.
Docker enables the creation and execution of lightweight containers that share the host system's kernel. Docker Engine is a platform for distributed applications that makes it possible to build, test, and deploy applications quickly. Docker Compose simplifies the orchestration of multiple containers through configuration in a YAML file, while Docker Machine enables the creation and administration of virtual Docker hosts. Docker Swarm allows Docker nodes to be grouped into a natively managed cluster.
Pentaho | Data Integration & Report designer - Hamdi Hmidi
Pentaho provides a suite of open source business intelligence tools for data integration, dashboarding, reporting, and data mining. It includes Pentaho Data Integration (Kettle) for ETL processes, Pentaho Dashboard for visualization dashboards, Pentaho Reporting for report generation, and incorporates Weka for data mining algorithms. Pentaho Report Designer is a visual report writer that allows querying data from various sources and generating reports in different formats like PDF, HTML, and Excel. It requires Java and involves downloading, unpacking, and installing the Pentaho reporting files.
Scaling Jenkins with Docker and Kubernetes - Carlos Sanchez
Docker is revolutionizing the way people think about applications and deployments. It provides a simple way to run and distribute Linux containers for a variety of use cases, from lightweight virtual machines to complex distributed micro-services architectures. Kubernetes is an open source project to manage a cluster of Linux containers as a single system, managing and running Docker containers across multiple Docker hosts, offering co-location of containers, service discovery and replication control. It was started by Google and now it is supported by Microsoft, RedHat, IBM and Docker Inc amongst others. The Jenkins Continuous Integration environment can be dynamically scaled by using the Kubernetes and Docker plugins, using containers to run slaves and jobs, and also isolating job execution.
An introduction to Docker native clustering: Swarm.
Deployment and configuration, integration with Consul, for a production-like cluster serving a web application with multiple containers on multiple hosts. #dockerops
This document provides an introduction to Docker Swarm, which allows multiple Docker hosts to be clustered together into a single virtual Docker host. It discusses key components of Docker Swarm including managers, nodes, services, discovery services, and scheduling. It also provides steps for creating a Swarm cluster, deploying services, and considering high availability and security aspects.
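Roughly the same manager/service/scheduling flow, sketched with the Docker SDK for Python; the advertise address, service name, and ports are placeholders:

```python
import docker
from docker.types import EndpointSpec, ServiceMode

client = docker.from_env()

# Make this engine a (single-node) swarm manager; address is a placeholder.
client.swarm.init(advertise_addr="192.168.1.10")

# A replicated service: the swarm scheduler spreads three nginx tasks
# across whatever nodes have joined the cluster.
service = client.services.create(
    "nginx:alpine",
    name="web",
    mode=ServiceMode("replicated", replicas=3),
    endpoint_spec=EndpointSpec(ports={8080: 80}),  # host 8080 -> container 80
)

service.scale(5)  # declarative scaling: swarm converges to five replicas
```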
This document describes Docker Compose, a tool for orchestrating applications that consist of multiple Docker containers. Docker Compose defines and runs multi-service applications from a single configuration file. It allows data volumes to be defined and shared between containers, as well as dependencies between services, to guarantee that containers start in the correct order. The document includes an example docker-compose.yml file that defines two services: a PostgreSQL database and an Nginx web server that connects to the database.
NGINX Plus PLATFORM For Flawless Application Delivery - Ashnikbiz
Flawless Application Delivery using Nginx Plus
By leveraging these latest features:
• Support for HTTP/2 standard
• Thread pools and socket sharding and how it can help improve performance
• NTLM support and new TCP security enhancements
• Advanced NGINX Plus monitoring, management and visibility of health & load checks
Catch this exclusive Google Hangout live!
November 4th, 2015 | 2.00-2.30PM IST | 4.30-5.00PM SGT
About the speaker: Sandeep Khuperkar, Director and CTO at Ashnik will be heading this session. He is an author, enthusiast and community moderator at opensource.com. He is also member of Open Source Initiative, Linux Foundation and Open Source Consortium Of India.
Business Intelligence and Big Data Analytics with Pentaho - Uday Kothari
This webinar gives an overview of the Pentaho technology stack and then delves deep into its features like ETL, Reporting, Dashboards, Analytics, and Big Data. The webinar also provides a cross-industry perspective on how Pentaho can be leveraged effectively for decision making. In the end, it also highlights how, apart from strong technological features, low TCO is central to Pentaho's value proposition. For BI technology enthusiasts, this webinar presents one of the easiest ways to learn an end-to-end analytics tool. For those interested in developing a BI/Analytics toolset for their organization, it presents an interesting option for leveraging low-cost technology. For big data enthusiasts, it presents an overview of how Pentaho has emerged as a leader in the data integration space for big data.
Pentaho is one of the leading niche players in Business Intelligence and Big Data Analytics. It offers a comprehensive, end-to-end open source platform for Data Integration and Business Analytics. Pentaho’s leading product: Pentaho Business Analytics is a data integration, BI and analytics platform composed of ETL, OLAP, reporting, interactive dashboards, ad hoc analysis, data mining and predictive analytics.
Docker Ecosystem: Engine, Compose, Machine, Swarm, Registry - Mario IC
The document presents an introduction to Docker and its main components, such as Docker Engine, Docker Compose, Docker Machine, Docker Swarm, and Docker Registry. It explains that Docker Engine is a platform for distributed applications that makes it possible to create and run containers in isolation. Docker Compose simplifies the orchestration of multiple containers through YAML configuration files. Docker Machine enables the creation and management of virtual Docker nodes, and Docker Swarm provides clustering functionality for containers. Finally, Docker Registry allows images to be stored.
GPU computing provides a way to access the power of massively parallel graphics processing units (GPUs) for general purpose computing. GPUs contain over 100 processing cores and can achieve over 500 gigaflops of performance. The CUDA programming model allows programmers to leverage this parallelism by executing compute kernels on the GPU from their existing C/C++ applications. This approach democratizes parallel computing by making highly parallel systems accessible through inexpensive GPUs in personal computers and workstations. Researchers can now explore manycore architectures and parallel algorithms using GPUs as a platform.
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos? - Carlos Sanchez
The Jenkins platform can be dynamically scaled by using several Docker cluster and orchestration platforms, using containers to run slaves and jobs and also isolating job execution. But which cluster technology should be used? Docker Swarm? Apache Mesos? Kubernetes? How do they compare? All of them can be used to dynamically run jobs inside containers. This talk will cover these main container clusters, outlining the pros and cons of each, the current state of the art of the technologies and Jenkins support.
This document provides an overview and examples of various data integration and transformation techniques in Pentaho Data Integration including:
- Loading data from flat files and databases into destinations like databases and Microsoft Analysis Services using JDBC connections.
- Performing different types of joins like inner joins, outer joins, Cartesian joins, and multi-way joins on data from different sources.
- Executing string operations on data fields.
- Loading data into and generating reports from Microsoft SQL Server Reporting Services.
- Sorting and joining data from multiple sources.
- Aggregating data using functions like sum, average, min, max.
- Outputting transformed data to Excel and Access databases.
Building Data Integration and Transformations using Pentaho - Ashnikbiz
This presentation will showcase the Data Integration capabilities of Pentaho which helps in building data transformations, through two demonstrations:
- How to build your first transformation to extract, transform and blend the data from various data sources
- How to add additional steps and filters to your transformation
Load Balancing Apps in Docker Swarm with NGINX - NGINX, Inc.
On-demand webinar recording: http://bit.ly/2mRjk2g
Docker and other container technologies continue to gain in popularity. We recently surveyed the broad community of NGINX and NGINX Plus users and found that two-thirds of organizations are either investigating containers, using them in development, or using them in production. Why? Because abstracting your applications from the underlying infrastructure makes developing, distributing, and running software simpler, faster, and more robust than ever before.
But when you move from running your app in a development environment to deploying containers in production, you face new challenges – such as how to effectively run and scale an application across multiple hosts with the performance and uptime that your customers demand.
The latest Docker release, 1.12, supports multihost container orchestration, which simplifies deployment and management of containers across a cluster of Docker hosts. In a complex environment like this, load balancing plays an essential part in delivering your container-based application with reliability and high performance.
Join us in this webinar to learn:
* The basic built-in load balancing options available in Docker Swarm Mode
* The pros and cons of moving to an advanced load balancer like NGINX
* How to integrate NGINX and NGINX Plus with Swarm Mode to provide an advanced load-balancing solution for a cluster with orchestration
* How to scale your Docker-based application with Swarm Mode and NGINX Plus
EDW CENIPA is an open-source project designed to enable analysis of aeronautical incidents that occurred in Brazilian civil aviation. The project uses BI techniques and tools that explore innovative low-cost technologies. Historically, Business Intelligence platforms have been expensive and impractical for small projects. BI projects require specialized skills and high development costs. This work aims to break this barrier.
Docker Swarm allows managing multiple Docker hosts as a single virtual Docker engine. The presenter demonstrates setting up a traditional Docker Swarm cluster with an external key-value store and load balancer. SwarmKit provides the core components of Docker Swarm as standalone binaries. Docker Swarm Mode is integrated directly into Docker Engine 1.12 and later, providing built-in orchestration without external components. The presenter then demonstrates a tutorial using Docker Swarm Mode to deploy a multi-container voting application across 3 Docker hosts and scale the service.
NSC #2 - D3 02 - Peter Hlavaty - Attack on the Core - NoSuchCon
This document discusses kernel exploitation techniques. It begins by explaining the KernelIo technique for reading and writing kernel memory on Windows and Linux despite protections like SMAP and SMEP. It then discusses several vulnerability cases that can enable KernelIo like out of bounds writes, kmalloc overflows, and abusing KASLR. Next, it analyzes design flaws in kernels like linked lists, hidden pointers, and callback mechanisms. It evaluates the state of exploitation on modern systems and envisions future hardened operating system designs. It advocates moving to C++ for exploitation development rather than shellcoding and introduces a C++ exploitation framework. The document was presented by Peter Hlavaty of the Keen Team and encourages recruitment for vulnerability research.
This document introduces OpenCL, a framework for parallel programming across heterogeneous systems. OpenCL allows developers to write programs that run across GPUs and multi-core CPUs. It provides portability, so the same code can run on different processor architectures. The document outlines OpenCL programming basics like kernels, memory objects, and the host code that manages kernels. It also provides a simple "Hello World" example of vector addition in OpenCL and recommends additional resources for learning OpenCL.
This document provides an overview of embedded Linux for an embedded systems design course. It discusses various commercial and open source embedded Linux distributions and their characteristics. It also covers important topics for embedded Linux including tool chains, the Linux kernel, debugging, driver development, memory management, and synchronization techniques. Example code snippets are provided for common Linux system programming tasks like file I/O, processes, threads, IPC, signals, and sockets. Cross-compiling for embedded targets is also briefly explained.
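As a taste of those system-programming tasks, here is a small Python analogue of the classic fork-plus-pipe IPC example (the course's own snippets would normally be in C; this is an illustrative translation, not course material):

```python
import os

# Parent/child IPC over a pipe: the same primitives (fork, pipe, wait)
# the C examples wrap, exposed through Python's os module on Linux.
r, w = os.pipe()
pid = os.fork()

if pid == 0:                         # child process
    os.close(r)
    os.write(w, b"hello from child\n")
    os._exit(0)                      # exit without running cleanup handlers
else:                                # parent process
    os.close(w)
    print("parent got:", os.read(r, 64))
    os.waitpid(pid, 0)               # reap the child to avoid a zombie
```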
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael Sevenier - AMD Developer Central
Presentation WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael Sevenier, at the AMD Developer Summit (APU13) November 11-13, 2013.
This document provides an overview and introduction to OpenCL, including:
- OpenCL allows for portable, parallel programming across heterogeneous systems like CPUs and GPUs.
- The OpenCL platform model uses kernels executed across a domain of work-items to parallelize work, with work organized into work-groups for synchronization.
- Memory is explicitly managed, with private, local, and global memory spaces accessible to kernels via the memory model.
- The host program sets up the OpenCL context and devices, builds kernels, manages memory objects, and submits commands via command queues to execute kernels and synchronize work.
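Pulling those pieces together, a minimal vector-addition example using PyOpenCL, the Python binding for OpenCL; the array size and kernel name are arbitrary:

```python
import numpy as np
import pyopencl as cl

a = np.random.rand(1024).astype(np.float32)
b = np.random.rand(1024).astype(np.float32)

ctx = cl.create_some_context()   # pick an available platform/device
queue = cl.CommandQueue(ctx)

# One work-item per element over a 1-D domain of 1024 items.
src = """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int gid = get_global_id(0);
    out[gid] = a[gid] + b[gid];
}
"""
prg = cl.Program(ctx, src).build()

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

prg.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)  # enqueue the kernel

result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)  # read the result back to the host
assert np.allclose(result, a + b)
```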
This presentation by Stanislav Donets (Lead Software Engineer, Consultant, GlobalLogic, Kharkiv) was delivered at GlobalLogic Kharkiv C++ Workshop #1 on September 14, 2019.
This talk covered:
- Graphics Processing Units: Architecture and Programming (theory).
- Scratch Example: Barnes Hut n-Body Algorithm (practice).
Conference materials: https://www.globallogic.com/ua/events/kharkiv-cpp-workshop/
This document provides an agenda and overview for an introduction to OpenCL course. The agenda includes lectures on understanding host programs, kernel programs, memory models, and optimization. Course materials include OpenCL reference cards, specifications, and exercises. An introduction to OpenCL explains that it is an open standard for parallel programming across heterogeneous systems like CPUs and GPUs. The OpenCL platform model includes devices like GPUs that are divided into compute units and processing elements. Kernels define work-items that execute problems in parallel over a domain.
Microsoft is working hard to modernize the .NET Platform. There are great new frameworks and tools coming, such as .NET Core and ASP.NET Core. The amount of new things is overwhelming, with multiple .NET Platforms (.NET Framework, Unified Windows Platform, .NET Core), multiple runtimes (CoreCLR, CLR, CoreRT), multiple compilers (Roslyn, RyuJIT, .NET Native and LLILC) and much more. This session will bring you up to speed on all this new Microsoft technology, focusing on .NET Core.
But we will also take a look at the first framework implementation on top of .NET Core for the Web: ASP.NET Core 1.0. You will learn about ASP.NET Core 1.0 and how it is different from ASP.NET 4.6. This will include Visual Studio 2015 support, cross-platform ASP.NET Core, and command-line tooling for working with ASP.NET Core and .NET Core projects.
After this session you know where Microsoft is heading in the near future. Be prepared for a new .NET Platform.
Application Profiling for Memory and Performance - pradeepfn
This document discusses application profiling for memory and performance. It explains that as concurrency increases, throughput initially increases but contention can then reduce performance. The key resources that can cause contention are CPU, memory, disk I/O, and network I/O. Various tools like JProfiler and JConsole can measure and diagnose contention. Common issues uncovered by profiling include memory leaks, deadlocks, and permgen errors. Profiling is important to optimize applications for production use.
Application Profiling for Memory and Performance - WSO2
This document discusses application profiling for memory and performance. It describes how to measure contention in CPU, memory, disk I/O, and network I/O as concurrency increases. It recommends performance tuning by identifying bottlenecks and shifting them through parameter tweaking and code profiling. Common issues like ClassCastExceptions, PermGen errors, deadlocks, NullPointerExceptions, and OutOfMemoryErrors can be addressed through profiling tools like JProfiler, Eclipse Memory Analyzer, and JConsole. The document provides examples of how WSO2 uses profiling to optimize products like the Identity Server for low-memory environments and Raspberry Pi clusters.
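The tools named above are JVM-side; the same snapshot-and-diff workflow for hunting leaks looks like this in pure Python with the standard library's tracemalloc (the leaking handler is a deliberate simulation):

```python
import tracemalloc

tracemalloc.start()

leak = []
def handle_request():
    leak.append(bytearray(10_000))   # deliberate, simulated per-request leak

before = tracemalloc.take_snapshot()
for _ in range(1000):
    handle_request()
after = tracemalloc.take_snapshot()

# Diff the snapshots: allocation sites that grew under load are the leak
# suspects, the same reasoning MAT applies to JVM heap dumps.
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```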
Infrastructure testing with Molecule and TestInfra - Tomislav Plavcic
This document discusses infrastructure as code testing using Molecule and TestInfra. It provides an overview of infrastructure as code, benefits of testing IaC, and introduces the Molecule and TestInfra tools. Molecule is used for testing Ansible roles and supports multiple operating systems, distributions, and providers. TestInfra allows writing unit tests in Python to test the configuration of servers managed by tools like Ansible. Examples are provided of using Molecule to create and test roles and using TestInfra modules to write tests.
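A small TestInfra test file of the kind described, assuming a hypothetical nginx role; it would run against a host with something like `py.test --hosts=ssh://admin@web01 test_webserver.py`:

```python
# test_webserver.py
def test_nginx_installed(host):
    assert host.package("nginx").is_installed

def test_nginx_running_and_enabled(host):
    svc = host.service("nginx")
    assert svc.is_running
    assert svc.is_enabled

def test_config_deployed(host):
    cfg = host.file("/etc/nginx/nginx.conf")
    assert cfg.exists
    assert cfg.user == "root"

def test_port_listening(host):
    assert host.socket("tcp://0.0.0.0:80").is_listening
```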
Building machine learning applications locally with Spark - Joel Pinho Lucas - PAPIs.io
In times of huge amounts of heterogeneous data, processing and extracting knowledge requires ever greater effort in building complex software architectures. In this context, Apache Spark provides a powerful and efficient approach for large-scale data processing. This talk will briefly introduce a powerful machine learning library (MLlib) along with a general overview of the Spark framework, describing how to launch applications within a cluster. A demo will show how to simulate a Spark cluster on a local machine using images available in a Docker Hub public repository. Finally, another demo will show how to save time by using unit tests to validate jobs before running them in a cluster.
Building machine learning applications locally with Spark - Joel Pinho Lucas
In times of huge amounts of heterogeneous data, processing and extracting knowledge requires ever greater effort in building complex software architectures. In this context, Apache Spark provides a powerful and efficient approach for large-scale data processing. This talk will briefly introduce a powerful machine learning library (MLlib) along with a general overview of the Spark framework, describing how to launch applications within a cluster. A demo will show how to simulate a Spark cluster on a local machine using images available in a Docker Hub public repository. Finally, another demo will show how to save time by using unit tests to validate jobs before running them in a cluster.
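A minimal pytest sketch of that last idea: validating a Spark job against a local master before it ever reaches a cluster (the job itself is a trivial stand-in):

```python
import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    # "local[2]" simulates a two-core cluster inside the test process,
    # so a job can be validated before it ever reaches a real cluster.
    session = (SparkSession.builder.master("local[2]")
               .appName("unit-tests").getOrCreate())
    yield session
    session.stop()

def add_revenue(df):
    """The job under test: a trivial stand-in transformation."""
    return df.withColumn("revenue", df.price * df.qty)

def test_add_revenue(spark):
    df = spark.createDataFrame([(2.0, 3), (5.0, 1)], ["price", "qty"])
    assert {r.revenue for r in add_revenue(df).collect()} == {6.0, 5.0}
```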
Latest (storage IO) patterns for cloud-native applications - OpenEBS
Applying microservice patterns to storage, giving each workload its own Container Attached Storage (CAS) system. This puts the DevOps persona in full control of the storage requirements and brings data agility to Kubernetes persistent workloads. We will go over the concept and implementation of CAS, as well as its orchestration.
This document discusses developing, delivering, and running Oracle ADF applications with Docker containers. It provides an overview of using containers and Docker to build application images, deploy them to Kubernetes clusters in the cloud, and set up continuous delivery pipelines for automated testing and deployment. Sample applications are packaged into Docker containers along with required dependencies. Kubernetes is used to orchestrate and manage container deployments across different environments.
Raffaele Rialdi discusses building plugins for Node.js that interface with .NET Core. He covers hosting the CoreCLR from C++, building a C++ V8 addon, and introduces xcore which allows calling .NET from JavaScript/TypeScript by analyzing metadata and optimizing performance. Examples show loading CLR, creating a .NET object, and calling methods from JavaScript using xcore. Potential use cases include Node.js apps, Electron apps, scripting Powershell, and Nativescript mobile apps.
Docker allows building portable software that can run anywhere by packaging an application and its dependencies in a standardized unit called a container. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes can replicate containers, provide load balancing, coordinate updates between containers, and ensure availability. Defining applications as Kubernetes resources allows them to be deployed and updated easily across a cluster.
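A sketch of defining and creating such a replicated Deployment with the official Kubernetes Python client; the names, image, and replica count are illustrative:

```python
from kubernetes import client, config

config.load_kube_config()  # reads ~/.kube/config

container = client.V1Container(
    name="web",
    image="nginx:1.25",
    ports=[client.V1ContainerPort(container_port=80)],
)
template = client.V1PodTemplateSpec(
    metadata=client.V1ObjectMeta(labels={"app": "web"}),
    spec=client.V1PodSpec(containers=[container]),
)
deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # Kubernetes keeps three replicas running
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=template,
    ),
)
client.AppsV1Api().create_namespaced_deployment(namespace="default",
                                                body=deployment)
```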
This document provides information on building a high performance computing cluster, including definitions of supercomputers, why they are needed, types of supercomputers, and steps for building a cluster. It outlines identifying the application, selecting hardware and software components, installation, configuration, testing, and maintenance. Homemade and commercial clusters are compared, and opportunities for generating revenue from clusters are discussed. Additional online resources for learning more are provided at the end.
CoreCLR is the OSS release of the CLR as we know it. Today we are able to build the CLR in Debug configuration, set breakpoints, and gather additional logging info that was so far available only to the Microsoft .NET team.
Http2 is here! And why the web needs it - IndicThreads
The document summarizes the evolution of HTTP from versions 0.9 to 2.0. It outlines the limitations of HTTP/1.1 for modern web pages with many dependent resources. HTTP/2 aims to address these limitations through features like multiplexing, header compression, server push and priority to reduce latency. It discusses implementations of HTTP/2 and the impact on developers. The document also briefly mentions upcoming protocols like QUIC that build on HTTP/2 to further optimize performance.
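A quick way to see HTTP/2 from code is the Python httpx client (installed with the http2 extra); the target URL is just a public HTTP/2-capable endpoint used for illustration:

```python
import httpx  # pip install 'httpx[http2]'

# Requests here reuse one HTTP/2 connection; with the async client the
# same calls would be multiplexed as concurrent streams.
with httpx.Client(http2=True) as client:
    for _ in range(3):
        r = client.get("https://nghttp2.org/httpbin/get")
        print(r.http_version, r.status_code)  # e.g. "HTTP/2 200"
```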
Understanding Bitcoin (Blockchain) and its Potential for Disruptive Applications - IndicThreads
Presented at the IndicThreads.com Software Development Conference 2016 held in Pune, India. More at http://www.IndicThreads.com and http://Pune16.IndicThreads.com
--
Go Programming Language - Learning The Go Lang way - IndicThreads
The document summarizes a presentation on the Go programming language. It covers the basics of Go including that it is open source, has no semicolons, uses namespaces and the "main" keyword. It then walks through examples of printing multiplication tables, using arrays and slices, testing code, concurrency using goroutines and channels, working with structs and interfaces. The presentation highlights Go's simplicity, reliability and efficiency and provides a GitHub link for the example code.
Presented at the IndicThreads.com Software Development Conference 2016 held in Pune, India. More at http://www.IndicThreads.com and http://Pune16.IndicThreads.com
--
The document outlines a presentation on building web applications with Go and Angular. It will demonstrate hosting a Go-based web server to define REST APIs that an Angular client application can consume. The presentation will cover setting up a Go HTTP handler to return JSON, building APIs with parameters, and integrating Angular templates, forms, and navigation to call the APIs and display dynamic content from the responses. Code examples and a GitHub repository will be provided.
Building on quicksand microservices - IndicThreads
The document discusses the evolution of distributed systems from single machines to replicated databases and services. It explains how eventual consistency allows for higher availability but reduces correctness by allowing stale data reads. The key is that different applications have different consistency needs based on their risk tolerance. Rather than strict consistency, eventual consistency with apologies is often sufficient and enables more flexible tradeoffs between correctness and availability for increased business value.
How to Think in RxJava Before Reacting (IndicThreads)
Presented at the IndicThreads.com Software Development Conference 2016 held in Pune, India. More at http://www.IndicThreads.com and http://Pune16.IndicThreads.com
--
IoT testing and quality assurance (IndicThreads)
The document discusses testing for Internet of Things (IoT) software. It begins with an introduction to IoT and describes emerging IoT applications and the typical IoT technology stack. It then discusses challenges in testing IoT software and how the role of quality assurance is changing. The document outlines various areas of IoT testing including connectivity, security, performance, functionality and more. It provides examples of test cases for each area. Finally, it proposes a strategy for effective IoT software testing that emphasizes automation, virtualization, robust backends, and testing at the design stage.
Functional Programming Past Present Future (IndicThreads)
Presented at the IndicThreads.com Software Development Conference 2016 held in Pune, India. More at http://www.IndicThreads.com and http://Pune16.IndicThreads.com
--
Harnessing the Power of Java 8 Streams (IndicThreads)
Presented at the IndicThreads.com Software Development Conference 2016 held in Pune, India. More at http://www.IndicThreads.com and http://Pune16.IndicThreads.com
--
Building & scaling a live streaming mobile platform - Gr8 road to fame (IndicThreads)
Presented at the IndicThreads.com Software Development Conference 2016 held in Pune, India.
More at http://www.IndicThreads.com
--
Internet of things architecture perspective - IndicThreads Conference
The Internet of Things is gaining an unprecedented amount of traction across the globe, and large organizations are making huge investments in IoT, which is going to change the shape of the 'Connected World'. Hence, it is important to understand the components, the technologies, and their interactions in the world of IoT.
The session will cover an introduction to IoT, its components, the forces that have brought the ecosystem to the mainstream, and its adoption across industries. Along with a reference architecture, I will discuss a few industry implementations in the IoT area with reference to that architecture. Next comes a comparative analysis of various IoT platforms available in the market and their architectures. Finally, I will take up the challenges in making IoT as pervasive as it is believed to be.
A key takeaway is an architectural appreciation of the IoT landscape. As of now, nearly every player in the market advertises its product as an IoT platform, but a comprehensive review of fundamental design and architecture puts this plethora of products (including open-source ones) in the right perspective. That is the objective of this talk.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
Cars and Computers: Building a Java Carputer (IndicThreads)
The average family car of today has significantly more computing power than the computers that got the first astronauts to the moon and back. Modern cars contain more and more computers to monitor and control every aspect of driving, from anti-lock brakes to engine management to satellite navigation.
This session will look at how Java can be (and is) used in cars to add more data collection. It covers a project written to collect a variety of data from a car while driving (including video) and then play it back later so that driving style and performance can be evaluated. There will be plenty of demonstrations.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
Remember the last time you tried to write a MapReduce job (something less trivial than a word count)? It surely did the work, but there are many pain points in going from an idea to its implementation in terms of map and reduce. Did you wonder how much simpler life would be if you could code as if performing collection operations, staying transparent to the distributed nature of the computation? Did you hope for more performant, lower-latency jobs? Well, it seems you are in luck.
In this talk, we will cover a different way to do MapReduce-style operations without being limited to just map and reduce: we will be talking about Apache Spark. We will compare and contrast Spark's programming model with MapReduce, and see where it shines, why to use it, and how to use it. We will cover aspects like testability, maintainability, and conciseness of the code, along with features like iterative processing and optional in-memory caching. We will see how Spark, being a cluster computing engine, abstracts the underlying distributed storage and cluster management, giving us a uniform interface to consume, process, and query data. We will explore the basic abstraction of the RDD, which provides the features that make Apache Spark a very good choice for big data applications, through some non-trivial code examples.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
Continuous Integration (CI) and Continuous Delivery (CD) using Jenkins & Docker (IndicThreads)
Continuous Integration (CI) is one of the most important tenets of agile practice, and Continuous Delivery (CD) is impossible without continuous integration. All practices are good and enhance productivity when other good practices and tools back them; for example, CI and CD without proper automated test cases can be a killer that destroys team productivity and puts delivery at risk. In this session I will share my experience of how CI and CD can be done in an optimized fashion, specifically for a feature-branch-based development approach.
We will discuss the best practices and ways of ensuring proper CI and CD in a feature-branch-based development approach.
I will showcase an automated Jenkins-based setup, geared to ensure that all feature branches and master remain in cohesive harmony.
At the end we will conclude with the essential components for ensuring successful CI and CD, and discuss the associated must-haves to make it a success.
Takeaways for participants
1. Understanding of CI and CD, and how CI can lead to CD.
2. How a DevOps engineer can leverage Jenkins and scripting to automate CI and CD for feature-branch-based development.
3. Demo of a CI setup developed on Jenkins.
4. General understanding and Q&A related to CI and CD.
5. How Docker can be used in such scenarios.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
Speed up your build pipeline for faster feedback (IndicThreads)
In this talk I will share how we brought our Jenkins build pipeline time down from over 90 minutes to under 12 minutes. I will share specific techniques that helped, and also some that logically made sense but actually did not help. If your team is trying to optimize its build times, this session might give you some ideas on how to approach the problem.
Development impact: the number of builds per day has increased as the build time has come down; the frequency of code check-ins has increased; wait time has reduced; and failed test cases are faster to isolate and fix.
The session will look at: why a long-running pipeline was hurting, key principles to speed up your build pipeline, bottlenecks, disk I/O examples and alternatives, insights from CPU profiling, divide and conquer, fail fast, and results.
The talk will highlight the importance of getting fast feedback, how to investigate long-running tests, how to run tests concurrently, RAM disks, SSDs, and hybrid disks, and why you should not assume but validate your hypotheses.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
OpenStack – an open source initiative for cloud management – has become a sensation in today's Infrastructure as a Service (IaaS) cloud space. With more than 10 subprojects to manage server, storage, network, security, and monitoring of the cloud, OpenStack provides a competitive and scalable open-source solution in the cloud space. Big players in public and private cloud such as VMware, Amazon, and IBM are actively investing in OpenStack and developing their products to integrate with it.
The session will talk about the architecture of OpenStack and will discuss why it has become a differentiating factor for business in cloud space through scalability, automation, intuitiveness and flexibility. The session will also discuss how it integrates with the Platform as a Service (PaaS) layer and scales to public and private cloud.
The session will also contain a live demo of how a simple private cloud can be set up using OpenStack. The demo will show how OpenStack makes cloud management easy enough for universities and small enterprises to rapidly adapt it to their business needs at almost no cost.
Finally, the session will discuss current challenges and trends in OpenStack community and how can one contribute to OpenStack as an enterprise or individual.
The speaker leads development of IBM’s new OpenStack based Infrastructure As A Service (IaaS) solution and will share his insights into OpenStack services and components.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
Digital Transformation of the Enterprise. What IT leaders need to know! (IndicThreads)
This presentation is about the changing times and the changing nature of IT services delivered to the consumer. In the past, these were delivered through thick or thin clients on the desktop; today, they are primarily delivered to the mobile in the form of a digital service.
While much of the talk is about the disruption that smartphones have brought, the truth is that the backend has to be more industrialised than ever before, due to the massive number of transactions that terminate in legacy IT infrastructure. Companies need both industrial IT and innovation IT to compete effectively in the digital marketplace. This presentation covers the different imperatives that new IT leaders have to think about in the digital era.
Session at the IndicThreads.com Conference held in Pune, India on 27-28 Feb 2015
http://www.indicthreads.com
http://pune15.indicthreads.com
7. Introduction to OpenCL
• Open Computing Language, a C-like language
• Framework for writing parallel algorithms
• Targets heterogeneous platforms
• Originally developed by Apple
• An open standard, controlled by the Khronos Group
8. Example of adding two vectors
Serial version:
for (int i = 0; i < n; i++)
    c[i] = a[i] + b[i];
Using OpenCL:
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *c)
{
    int i = get_global_id(0); // get the work-item (thread) id
    c[i] = a[i] + b[i];
}
9. OpenCL Architecture
1. Platform model
2. Execution model
3. Memory model
4. Programming model
11. OpenCL - Execution Model
1. Kernel
2. Work-items
3. Work-group
4. ND-range
5. Program
6. Memory objects
7. Command queues
Example kernel, where each work-item handles one element:
__kernel void add(__global const float *a, __global const float *b, __global float *c)
{
    int i = get_global_id(0); // get the work-item (thread) id
    c[i] = a[i] + b[i];
}
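For orientation, here is a minimal sketch (not from the original deck) of how these execution-model terms surface in kernel code, assuming a one-dimensional ND-range; the built-in ID functions are standard OpenCL C:
__kernel void show_ids(__global int *global_ids,
                       __global int *local_ids,
                       __global int *group_ids)
{
    size_t g = get_global_id(0);           // this work-item's position in the ND-range
    global_ids[g] = (int)g;
    local_ids[g]  = (int)get_local_id(0);  // position within the enclosing work-group
    group_ids[g]  = (int)get_group_id(0);  // which work-group this item belongs to
}
The host side groups work-items into work-groups by passing a local size to clEnqueueNDRangeKernel.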
12. Memory Model in OpenCL
Compute device:
• Each compute unit (0, 1, 2, ...) has its own private registers and its own local memory/cache
• All compute units share global constant memory (DRAM) and global memory (DRAM)
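A minimal kernel sketch (again not from the deck; the kernel name and arguments are illustrative only) showing how these memory regions appear in OpenCL C through address-space qualifiers:
__kernel void scale(__global float *data,      // global memory (DRAM)
                    __constant float *factor,  // global constant memory
                    __local float *scratch)    // local memory, shared within one compute unit
{
    size_t g = get_global_id(0);
    float x = data[g];                         // x lives in private registers
    scratch[get_local_id(0)] = x * factor[0];
    barrier(CLK_LOCAL_MEM_FENCE);              // synchronize the work-group before reuse
    data[g] = scratch[get_local_id(0)];
}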
13. Programming model
1. Data parallel: a single function applied to multiple data items
2. Task parallel: multiple functions applied to single data
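As a hedged host-side illustration (not from the slides; the queue and kernel variables are hypothetical): a data-parallel launch enqueues one kernel across an ND-range, while a task-parallel style enqueues independent single-work-item kernels:
/* Data parallel: N work-items all run the same kernel. */
size_t global = N;
clEnqueueNDRangeKernel(queue, add_kernel, 1, NULL, &global, NULL, 0, NULL, NULL);

/* Task parallel: independent kernels, each as a single work-item
   (clEnqueueTask is equivalent to an ND-range of size 1). */
clEnqueueTask(queue, transform_kernel, 0, NULL, NULL);
clEnqueueTask(queue, reduce_kernel, 0, NULL, NULL);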
15. Essential Development Tasks
Workflow: Parallelize code → Initialize OpenCL environment → Initiate kernel and kernel data → Execute kernels → Read back data to host
• Parallelize code: C code, with restrictions
16. Essential Development Tasks - Initialize OpenCL environment
• Query compute device
• Create context
• Compile kernels
17. Essential Development Tasks - Initiate kernel and kernel data
• Create memory objects
• Map application data structures to OpenCL-supported data structures
• Initialize kernel parameters
18. Essential Development Tasks - Execute kernels
• Specify the number of threads (work-items) to execute the task
• Trigger the execution of the kernel, sync or async
19. Essential Development Tasks - Read back data to host
• Map results back to application data structures
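Putting the five tasks together, here is a minimal, self-contained host-side sketch in C (not from the original deck) that runs the add kernel from slide 8. It assumes an OpenCL 1.x SDK is installed; error handling is collapsed into asserts, whereas a real host program should check and report every cl* return code:
#include <assert.h>
#include <CL/cl.h>

#define N 1024

static const char *src =
    "__kernel void add(__global const float *a,\n"
    "                  __global const float *b,\n"
    "                  __global float *c)\n"
    "{\n"
    "    int i = get_global_id(0);\n"
    "    c[i] = a[i] + b[i];\n"
    "}\n";

int main(void)
{
    float a[N], b[N], c[N];
    for (int i = 0; i < N; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* Initialize OpenCL environment: query a device, create context and queue. */
    cl_platform_id plat;
    cl_device_id dev;
    cl_int err;
    err = clGetPlatformIDs(1, &plat, NULL);                            assert(err == CL_SUCCESS);
    err = clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL); assert(err == CL_SUCCESS);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, &err); assert(err == CL_SUCCESS);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, &err);      assert(err == CL_SUCCESS);

    /* Compile the kernel source at run time. */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err); assert(err == CL_SUCCESS);
    err = clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);                 assert(err == CL_SUCCESS);
    cl_kernel k = clCreateKernel(prog, "add", &err);                       assert(err == CL_SUCCESS);

    /* Initiate kernel and kernel data: create memory objects, set arguments. */
    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof a, a, &err);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof b, b, &err);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, &err);
    clSetKernelArg(k, 0, sizeof da, &da);
    clSetKernelArg(k, 1, sizeof db, &db);
    clSetKernelArg(k, 2, sizeof dc, &dc);

    /* Execute: one work-item per vector element over a 1-D ND-range. */
    size_t global = N;
    err = clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    assert(err == CL_SUCCESS);

    /* Read back: blocking copy of the result buffer into host memory. */
    err = clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);
    assert(err == CL_SUCCESS);
    assert(c[1] == 3.0f); /* 1 + 2*1 */

    /* Release resources. */
    clReleaseMemObject(da); clReleaseMemObject(db); clReleaseMemObject(dc);
    clReleaseKernel(k); clReleaseProgram(prog);
    clReleaseCommandQueue(q); clReleaseContext(ctx);
    return 0;
}
Compile with something like gcc host.c -lOpenCL; include and library paths vary by vendor SDK.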
20. Introduction to WebCL
• JavaScript bindings for OpenCL
• First announced in March 2011 by Khronos
• API definition underway
• A prototype plugin is available only for the Firefox browser
21. Binding OpenCL to WebCL
Stack (diagram): on the CPU, a JavaScript host application calls WebCL, which binds to the OpenCL framework, which in turn drives an OpenCL-compliant device.
23. Applications of OpenCL
• Database mining
• Neural networks
• Physics-based simulation, mechanics
• Image processing
• Speech processing
• Weather forecasting and climate research
• Bioinformatics
24. Conclusion
• Significant performance gains in using OpenCL for computations in client-side environments like HTML5
• Algorithms need to be 'parallelizable'
• Further optimizations can be achieved by exploiting the memory model
25. Software/Hardware used in demo application
Hardware
• Intel(R) Core(TM)2 Quad CPU Q8400 @ 2.66GHz
• Nvidia Quadro NVS 160M, 8 cores @ 580 MHz
Software
• OpenCL runtime for CPU: http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk/
• OpenCL runtime for GPU: http://www.nvidia.com/object/quadro_nvs_notebook.html
• WebCL plugin for Firefox: http://webcl.nokiaresearch.com/
#4: What are multicores? Compare CPU and GPU. Advantages of using GPU, and how it is possible to do so. Moore's law. Simple questions.
#5: Explain the network diagram: how to render, the performance issue, and solving it using parallel computation with OpenCL technology. Use different colors in the force-directed layout. Real-time use, giving importance to the diagram rather than the text. Replace the force-directed algorithm with a simple layout algorithm.
#6: Context for the demo. Pause and explain details, with variation in speech.
#8: Brief OpenCL intro. Draft and open specification. Parallel only! Thread management and synchronization are simple.
#10: Why these models? How the architecture helps in improving performance.