The document discusses the OpenQuake Infomall, which aims to provide earthquake data, simulations, and analysis tools as cloud-based services, enabling researchers to access and share resources and build workflows linking different services. It notes important trends like data growth, parallel computing on multicore systems and clouds, and the potential for "X as a Service" delivery models to improve collaboration and reproducibility in earthquake science. Key challenges include standardizing interfaces to allow interoperability between different data sources and analysis tools.
The document summarizes the Open Grid Computing Environments (OGCE) software tools for building science gateways. It describes several key components:
1) The OGCE Gadget Container allows building portals out of Google gadgets and supports workflows, registries, and experiments.
2) Tools like XBaya allow composing scientific workflows that can run on resources like the TeraGrid.
3) The software is open source and can be used individually or together to power science gateways and provide interfaces and services to computational resources.
The document summarizes a tutorial presentation about the Open Grid Computing Environments (OGCE) software tools for building science gateways. The OGCE tools include a gadget container, workflow composer called XBaya, and application factory service called GFac. The presentation demonstrates how these tools can be used to build portals and compose workflows to access resources like the TeraGrid.
The document summarizes the Open Grid Computing Environments (OGCE) software and activities. It describes various OGCE software components like the gadget container, XBaya workflow composer, and GFAC application wrapper. It also discusses collaborations with gateways like UltraScan, GridChem, and SimpleGrid to integrate OGCE tools and provide more advanced support for workflows, job management, and other capabilities.
Rave is an Apache incubator project that provides tools for building science gateways using open standards. It allows creation of a downloadable portal using minimal configuration and provides ways for developers to customize and extend the portal. Rave uses a model-view-controller architecture and is implemented in JavaScript and Java with components like user management, widgets, and configuration files that can be modified by developers.
The document discusses Microsoft Research's ORECHEM project, which aims to integrate chemistry scholarship with web architectures, grid computing, and the semantic web. It involves developing infrastructure to enable new models for research and dissemination of scholarly materials in chemistry. Key aspects include using OAI-ORE standards to describe aggregations of web resources related to crystallography experiments. The objective is to build a pipeline that extracts 3D coordinate data from feeds, performs computations on resources like TeraGrid, and stores resulting RDF triples in a triplestore. RESTful web services are implemented to access different steps in the workflow.
Bringing complex event processing to Spark streamingDataWorks Summit
Complex event processing (CEP) is about identifying business opportunities and threats in real time by detecting patterns in data and taking appropriate automated action. Example business use cases for CEP include location-based marketing, smart inventories, targeted ads, Wi-Fi offloading, fraud detection, churn prediction, fleet management, predictive maintenance, security incident event management, and many more. While Spark Streaming provides a distributed, resilient framework for ingesting events in real time, effort is still needed to build CEP applications. This is because CEP use cases require correlation of events, which in turn requires us to treat every incoming event as a discrete occurrence in time, whereas Spark Streaming treats the entire batch of events as a single occurrence. Many CEP use cases also require alerts to be fired even when there is no incoming event. An example of such a use case is to fire an alert when an order-shipped event is NOT received within the SLA window following an order-received event. At Oracle we have adopted a few techniques, such as running continuous query engines as long-running tasks and using empty batches as triggers, to bring complex event processing to Spark Streaming.
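To make the empty-batch trigger concrete, below is a minimal PySpark Streaming sketch of the SLA-timeout pattern described above; the socket source, event format, and SLA value are illustrative assumptions, and this is not Oracle's implementation.

```python
# Hypothetical sketch: fire an alert if an "order-shipped" event does not
# follow "order-received" within an SLA window. Illustrative only; this is
# not Oracle's CEP implementation.
import time

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

SLA_SECONDS = 300  # assumed SLA window

def track_order(new_events, state):
    # State per order_id: (first_seen_at, shipped). updateStateByKey calls
    # this for every keyed state on every micro-batch, even an empty one,
    # which is what lets the timeout fire without any incoming event.
    first_seen_at, shipped = state if state else (time.time(), False)
    if "order-shipped" in new_events:
        shipped = True
    if not shipped and time.time() - first_seen_at > SLA_SECONDS:
        print("ALERT: order-shipped not received within SLA")  # real sink here
        return None  # drop the state once the alert has fired
    return (first_seen_at, shipped)

sc = SparkContext(appName="cep-sla-sketch")
ssc = StreamingContext(sc, batchDuration=5)
ssc.checkpoint("/tmp/cep-checkpoint")  # required by updateStateByKey

# Assumed input: "order_id,event_type" lines on a local socket.
events = (ssc.socketTextStream("localhost", 9999)
             .map(lambda line: tuple(line.split(",", 1))))
events.updateStateByKey(track_order).pprint()

ssc.start()
ssc.awaitTermination()
```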
Join us to learn more on CEP for Spark, the fastest growing data processing platform in the world.
Speakers
Prabhu Thukkaram, Senior Director, Product Development, Oracle
Hoyong Park, Architect, Oracle
Apache Spark 2.4 Bridges the Gap Between Big Data and Deep LearningDataWorks Summit
Big data and AI are joined at the hip: AI applications require massive amounts of training data to build state-of-the-art models. The problem is, big data frameworks like Apache Spark and distributed deep learning frameworks like TensorFlow don’t play well together due to the disparity between how big data jobs are executed and how deep learning jobs are executed.
Apache Spark 2.4 introduced a new scheduling primitive: barrier scheduling. Users can indicate to Spark whether it should use MapReduce mode or barrier mode at each stage of the pipeline, making it easy to embed distributed deep learning training as a Spark stage and simplify the training workflow. In this talk, I will demonstrate how to build a real-world pipeline that combines data processing with Spark and deep learning training with TensorFlow, step by step. I will also share best practices and hands-on experience to show the power of this new feature, and open up more discussion on this topic.
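As a rough illustration of the barrier primitive (not the full TensorFlow pipeline from the talk), the PySpark sketch below runs a stage in barrier mode, waits for all tasks to start, and discovers peer addresses; the "training" step is a placeholder.

```python
# Sketch of Spark 2.4+ barrier scheduling; the training step is a placeholder
# (a real pipeline would hand the peer addresses to TensorFlow workers).
# Run with at least as many cores as partitions, e.g. --master local[4].
from pyspark import BarrierTaskContext
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("barrier-sketch").getOrCreate()
sc = spark.sparkContext

def train_partition(iterator):
    ctx = BarrierTaskContext.get()
    ctx.barrier()  # block until every task in the barrier stage has started
    peers = [info.address for info in ctx.getTaskInfos()]
    # Placeholder "training": a real job would configure distributed TF with
    # `peers` and ctx.partitionId() here, then run the training loop.
    yield (ctx.partitionId(), len(peers), sum(iterator))

data = sc.parallelize(range(1000), numSlices=4)
print(data.barrier().mapPartitions(train_partition).collect())
```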
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraJoe Stein
Slides for the solution we developed using Mesos, Docker, Kafka, Spark, Cassandra, and Solr (DataStax Enterprise Edition), all written in Go, for real-time log analysis at scale. Many organizations either need or want log analysis in real time, where you can see within a second what is happening across your entire infrastructure. Today, with the hardware available and the software systems we have in place, you can develop, build, and offer these solutions as a service.
The document discusses LLAP (Live Long and Process), a new execution engine in Apache Hive 2.0 that enables sub-second analytical queries. LLAP keeps a small subset of frequently accessed data in memory to enable faster query processing times compared to traditional Hive architectures that rely on disk access. It works by running Hive query fragments simultaneously in both YARN containers and long-running daemon processes that cache data in memory. This allows for highly concurrent query execution without specialized YARN configurations. The document provides details on how LLAP is implemented and evaluates its performance benefits based on benchmarks and customer case studies.
Spark-on-Yarn: The Road Ahead-(Marcelo Vanzin, Cloudera)Spark Summit
Spark on YARN provides resource management and security features through YARN, but still has areas for improvement. Dynamic allocation in YARN allows Spark applications to grow and shrink executors based on task demand, though latency and data locality could be enhanced. Security supports Kerberos authentication and delegation tokens, but long-lived applications face token expiration issues and encryption needs improvement for control plane, shuffle files, and user interfaces. Overall, usability, security, and performance remain areas of focus.
The document discusses tools and techniques used by Uber's Hadoop team to make their Spark and Hadoop platforms more user-friendly and efficient. It introduces tools like SCBuilder to simplify Spark context creation, Kafka dispersal to distribute RDD results, and SparkPlug to provide templates for common jobs. It also describes a distributed log debugger called SparkChamber to help debug Spark jobs and techniques like building a spatial index to optimize geo-spatial joins. The goal is to abstract out infrastructure complexities and enforce best practices to make the platforms more self-service for users.
Python in the Hadoop Ecosystem (Rock Health presentation)Uri Laserson
A presentation covering the use of Python frameworks on the Hadoop ecosystem. Covers, in particular, Hadoop Streaming, mrjob, luigi, PySpark, and using Numba with Impala.
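For flavor, a minimal mrjob word-count job is sketched below; mrjob wraps Hadoop Streaming, so the same script can run locally or, with `-r hadoop`, on a cluster (the invocation details are assumptions about a configured environment).

```python
# wordcount_mr.py -- minimal mrjob job. Run locally with
#   python wordcount_mr.py input.txt
# or (assuming a configured Hadoop client) with -r hadoop.
import re

from mrjob.job import MRJob

WORD_RE = re.compile(r"[\w']+")

class MRWordCount(MRJob):
    def mapper(self, _, line):
        for word in WORD_RE.findall(line):
            yield word.lower(), 1

    def reducer(self, word, counts):
        yield word, sum(counts)

if __name__ == "__main__":
    MRWordCount.run()
```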
Deep Learning with DL4J on Apache Spark: Yeah it's Cool, but are You Doing it...DataWorks Summit
DeepLearning4J (DL4J) is a powerful open source distributed framework that brings deep learning to the JVM (it can serve as a DIY tool for Java, Scala, Clojure, and Kotlin programmers). It can be used on distributed GPUs and CPUs and is integrated with Hadoop and Apache Spark. ND4J is an open source, distributed, GPU-enabled library that brings the intuitive scientific computing tools of the Python community to the JVM. Training neural network models with DL4J, ND4J, and Spark is a powerful combination, but the overall cluster configuration can present some unexpected issues that compromise performance and nullify the benefits of well-written code and good model design. In this talk I will walk through some of those problems and present best practices to prevent them. The presented use cases refer to DL4J and ND4J on different Spark deployment modes (standalone, YARN, Kubernetes). The reference programming language for the code examples is Scala, but no prior Scala knowledge is required to follow the presented topics.
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Spark Summit
Machine learning is being deployed in a growing number of applications which demand real-time, accurate, and robust predictions under heavy query load. However, most machine learning frameworks and systems only address model training and not deployment.
In this talk, we present Clipper, a general-purpose low-latency prediction serving system. Interposing between end-user applications and a wide range of machine learning frameworks, Clipper introduces a modular architecture to simplify model deployment across frameworks. Furthermore, by introducing caching, batching, and adaptive model selection techniques, Clipper reduces prediction latency and improves prediction throughput, accuracy, and robustness without modifying the underlying machine learning frameworks. We evaluate Clipper on four common machine learning benchmark datasets and demonstrate its ability to meet the latency, accuracy, and throughput demands of online serving applications. We also compare Clipper to the TensorFlow Serving system and demonstrate comparable prediction throughput and latency on a range of models while enabling new functionality, improved accuracy, and robustness.
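As a rough sketch of how an application might call such a prediction-serving layer over REST, the snippet below posts features to an assumed Clipper-style endpoint; the host, port, application name, and payload shape are assumptions about a particular deployment rather than a guaranteed API.

```python
# Hedged sketch of querying a prediction-serving REST endpoint such as
# Clipper's query frontend. Host, port (1337), app name, and payload format
# are assumptions about a particular deployment, not a guaranteed API.
import requests

CLIPPER_URL = "http://localhost:1337/my-app/predict"  # assumed deployment

def predict(features):
    resp = requests.post(CLIPPER_URL, json={"input": features}, timeout=1.0)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(predict([0.1, 0.2, 0.3, 0.4]))
```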
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeSpark Summit
This document discusses Apache Zeppelin, an open-source notebook for interactive data analytics. It provides an overview of Zeppelin's features, including interactive notebooks, multiple backends, interpreters, and a display system. The document also covers Zeppelin's adoption timeline, from its origins as a commercial product in 2012 to becoming an Apache Incubator project in 2014. Future projects involving Zeppelin like Helium and Z-Manager are also briefly described.
Workshop
December 9, 2015
LBS College of Engineering
www.sarithdivakar.info | www.csegyan.org
http://sarithdivakar.info/2015/12/09/wordcount-program-in-python-using-apache-spark-for-data-stored-in-hadoop-hdfs/
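In the spirit of the linked workshop example, here is a minimal PySpark word count over data stored in HDFS; the input and output paths are placeholders.

```python
# Minimal PySpark word count over data in HDFS; paths are placeholders.
from operator import add

from pyspark import SparkContext

sc = SparkContext(appName="wordcount-hdfs")

counts = (sc.textFile("hdfs:///user/demo/input.txt")
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(add))

counts.saveAsTextFile("hdfs:///user/demo/wordcount-output")
sc.stop()
```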
Cloud Operations with Streaming Analytics using Apache NiFi and Apache FlinkDataWorks Summit
The amount of information coming from a cloud deployment that could be used to gain better situational awareness and operate it efficiently is huge. Tools such as the ones provided by the Apache Software Foundation can be used to build a solution to that challenge.
Nowadays cloud deployments are pervasive in businesses, with scalability and multi-tenancy as their core capabilities. This means that these deployments can easily grow beyond 1,000 nodes, and efficient operation of these huge clusters requires real-time analysis of logs, metrics, events, and configuration data. Performing correlation and finding patterns, not just to get to root causes but also to predict failures and reduce risk, requires tools that go beyond current solutions.
In the prototype developed by Red Hat and KEEDIO (keedio.com), we managed to address the above challenges with Big Data tools such as Apache NiFi, Apache Kafka, and Apache Flink, which enabled us to process the constant stream of syslog messages (RFC 5424) produced by the Infrastructure as a Service provided by OpenStack services, and also to detect common failure patterns that could arise and generate alerts as needed.
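As a simplified illustration of the failure-pattern idea (using plain Python and kafka-python rather than the NiFi/Flink prototype described above), the sketch below consumes syslog lines from an assumed Kafka topic and flags lines matching illustrative failure signatures.

```python
# Simplified illustration only: consume syslog lines from Kafka and flag
# known failure signatures. Topic, brokers, and patterns are illustrative;
# the actual prototype used Apache NiFi and Apache Flink.
import re

from kafka import KafkaConsumer  # pip install kafka-python

FAILURE_PATTERNS = {
    "amqp_unreachable": re.compile(r"AMQP server .* unreachable", re.I),
    "nova_error":       re.compile(r"\bnova\b.*(ERROR|CRITICAL)", re.I),
}

consumer = KafkaConsumer(
    "openstack-syslog",                    # assumed topic
    bootstrap_servers=["localhost:9092"],  # assumed brokers
    value_deserializer=lambda b: b.decode("utf-8", errors="replace"),
)

for record in consumer:
    line = record.value
    for name, pattern in FAILURE_PATTERNS.items():
        if pattern.search(line):
            print(f"ALERT [{name}]: {line}")  # replace with a real alert sink
```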
This session is an intermediate talk in our Apache NiFi and Data Science track. It focuses on Apache Flink, Apache NiFi, and Apache Kafka and is geared towards Architect, Data Scientist, Data Analyst, and Developer/Engineer audiences.
Speaker
Miguel Perez Colino, Senior Design Product Manager, Red Hat
Suneel Marthi, Senior Principal Engineer, Red Hat
Why apache Flink is the 4G of Big Data Analytics FrameworksSlim Baltagi
This document provides an overview and agenda for a presentation on Apache Flink. It begins with an introduction to Apache Flink and how it fits into the big data ecosystem. It then explains why Flink is considered the "4th generation" of big data analytics frameworks. Finally, it outlines next steps for those interested in Flink, such as learning more or contributing to the project. The presentation covers topics such as Flink's APIs, libraries, architecture, programming model and integration with other tools.
Building large scale applications in yarn with apache twillHenry Saputra
This document summarizes a presentation about Apache Twill, which provides abstractions for building large-scale applications on Apache Hadoop YARN. It discusses why Twill was created to simplify developing on YARN, Twill's architecture and components, key features like real-time logging and elastic scaling, real-world uses at CDAP, and the Twill roadmap.
Supporting Highly Multitenant Spark Notebook Workloads with Craig Ingram and ...Spark Summit
Notebooks: they enable our users, but they can cripple our clusters. Let’s fix that. Notebooks have soared in popularity at companies world-wide because they provide an easy, user-friendly way of accessing the cluster-computing power of Spark. But the more users you have hitting a cluster, the harder it is to manage the cluster resources as big, long-running jobs start to starve out small, short-running jobs. While you could have users spin up EMR-style clusters, this reduces the ability to take advantage of the collaborative nature of notebooks. It also quickly becomes expensive as clusters sit idle for long periods of time waiting on single users. What we want is fair, efficient resource utilization on a large single cluster for a large number of users. In this talk we’ll discuss dynamic allocation and the best practices for configuring the current version of Spark as-is to help solve this problem. We’ll also present new improvements we’ve made to address this use case. These include: decommissioning executors without losing cached data, proactively shutting down executors to prevent starvation, and improving the start times of new executors.
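For reference, the sketch below shows the kind of dynamic-allocation settings discussed here as they might be applied to a shared notebook cluster; the property names are standard Spark configuration keys, while the values are illustrative placeholders, not recommendations from the talk.

```python
# Illustrative dynamic-allocation settings for a shared notebook cluster.
# The property names are standard Spark configuration keys; the values are
# placeholders to be tuned per cluster, not recommendations from the talk.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("notebook-session")
         .config("spark.dynamicAllocation.enabled", "true")
         .config("spark.shuffle.service.enabled", "true")
         .config("spark.dynamicAllocation.minExecutors", "1")
         .config("spark.dynamicAllocation.maxExecutors", "20")
         # Release idle executors quickly so other users can get them.
         .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
         # Executors holding cached data are kept around longer.
         .config("spark.dynamicAllocation.cachedExecutorIdleTimeout", "30min")
         .getOrCreate())
```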
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesDataWorks Summit
Apache Druid supports auto-scaling of Middle Manager nodes to handle changes in data ingestion load. On Kubernetes, this can be implemented using Horizontal Pod Autoscaling based on custom metrics exposed from the Druid Overlord process, such as the number of pending/running tasks and expected number of workers. The autoscaler scales the number of Middle Manager pods between minimum and maximum thresholds to maintain a target average load percentage.
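A hedged Python sketch of the custom-metric side of such a setup is shown below: it polls the Overlord for pending and running task counts and derives a desired Middle Manager count that an autoscaler could act on; the endpoints, capacity figure, and scaling formula are assumptions for illustration.

```python
# Hedged sketch: derive a desired Middle Manager replica count from Druid
# Overlord task counts. Endpoints, slot capacity, and the scaling formula
# are assumptions for illustration.
import math

import requests

OVERLORD = "http://druid-overlord:8090"  # assumed service address
SLOTS_PER_MIDDLE_MANAGER = 4             # assumed worker capacity
TARGET_LOAD = 0.6                        # target average slot utilization
MIN_REPLICAS, MAX_REPLICAS = 2, 20

def task_count(path):
    resp = requests.get(f"{OVERLORD}{path}", timeout=5)
    resp.raise_for_status()
    return len(resp.json())

def desired_replicas():
    tasks = (task_count("/druid/indexer/v1/pendingTasks")
             + task_count("/druid/indexer/v1/runningTasks"))
    needed = math.ceil(tasks / (SLOTS_PER_MIDDLE_MANAGER * TARGET_LOAD))
    return max(MIN_REPLICAS, min(MAX_REPLICAS, needed))

if __name__ == "__main__":
    print("desired Middle Manager replicas:", desired_replicas())
```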
Microservices, Containers, and Machine LearningPaco Nathan
Session talk for Data Day Texas 2015, showing GraphX and SparkSQL for text analytics and graph analytics of an Apache developer email list -- including an implementation of TextRank in Spark.
Native Support of Prometheus Monitoring in Apache Spark 3.0Databricks
All production environments require monitoring and alerting. Apache Spark has a configurable metrics system that allows users to report Spark metrics to a variety of sinks. Prometheus is a popular open-source monitoring and alerting toolkit that is commonly used together with Apache Spark.
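A hedged sketch of enabling the native Prometheus endpoints from PySpark is shown below; the configuration keys reflect my understanding of Spark 3.0 and should be verified against the version in use.

```python
# Hedged sketch of enabling Spark 3.0's native Prometheus endpoints from
# PySpark; verify the exact keys against the Spark version in use.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("prometheus-metrics-sketch")
         # Executor metrics under the driver UI (/metrics/executors/prometheus).
         .config("spark.ui.prometheus.enabled", "true")
         # Driver metrics via the PrometheusServlet sink (/metrics/prometheus).
         .config("spark.metrics.conf.*.sink.prometheusServlet.class",
                 "org.apache.spark.metrics.sink.PrometheusServlet")
         .config("spark.metrics.conf.*.sink.prometheusServlet.path",
                 "/metrics/prometheus")
         .getOrCreate())
```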
This document discusses running Spark applications on YARN and managing Spark clusters. It covers challenges like predictable job execution times and optimal cluster utilization. Spark on YARN is introduced as a way to leverage YARN's resource management. Techniques like dynamic allocation, locality-aware scheduling, and resource queues help improve cluster sharing and utilization for multi-tenant workloads. Security considerations for shared clusters running sensitive data are also addressed.
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...Spark Summit
Today there are several compliance use cases (archiving, e-discovery, supervision and surveillance, to name a few) that appear naturally suited to Hadoop workloads but haven't seen wide adoption. In this talk, we'll discuss common limitations, how Apache Spark helps, and propose some new blueprints for modernizing this architecture and disrupting existing solutions. Additionally, we'll discuss the rising role of Apache Spark in this ecosystem, leveraging machine learning and advanced analytics in a space that has traditionally been restricted to fairly rote reporting.
Apache Airavata is an open source science gateway software framework that allows users to compose, manage, execute, and monitor distributed computational workflows. It provides tools and services to register applications, schedule jobs on various resources, and manage workflows and generated data. Airavata is used across several domains to support scientific workflows and is largely derived from academic research funded by the NSF.
The document provides an overview of the Science Gateway Group at Indiana University. It introduces the group members and describes their focus areas as developing open source software for cyberinfrastructure like Apache Rave and Apache Airavata. It discusses the group's work on extending collaborations with application scientists in various domains. The document also outlines possibilities for collaboration with the PTI CREST Lab on topics like scientific workflows and generalized execution frameworks.
This document describes the factors that influence how the price of a product or service is determined. It explains that price is a key marketing variable and depends on internal factors such as the company's costs and objectives, and on external factors such as competition, market demand, and consumers' price sensitivity. It also analyzes the different approaches to setting pricing policies and the importance of considering multiple variables when fixing prices.
This document provides an overview of Apache Airavata, an open source software framework for executing and managing computational jobs and workflows across different computing resources. It discusses Apache Airavata's architectural goals of being distributed, scalable, fault tolerant, secure, and component-based. The key components of Apache Airavata's architecture are described, including how it supports multiple gateways and job monitoring. The document also outlines some of Apache Airavata's security features and how new computational resources and clients can integrate with it.
The document describes the OGCE Workflow Toolkit and its applications for multi-scale science. It discusses the Data Capacitor storage solution and science gateways that provide tools and data access via portals. The LEAD weather gateway is used as an example, allowing access to radar data. The OGCE toolkit includes services like the workflow engine, registry, and event notification bus that enable flexible and extensible scientific workflows across resources like TeraGrid.
Cyberinfrastructure Experiences with Apache Airavatasmarru
In this short presentation, we summarize Apache Airavata's use of component-based architecture to encompass major gateway capabilities (such as metadata management, meta-scheduling, execution management, and messaging).
Cask Webinar
Date: 08/10/2016
Link to video recording: https://www.youtube.com/watch?v=XUkANr9iag0
In this webinar, Nitin Motgi, CTO of Cask, walks through the new capabilities of CDAP 3.5 and explains how your organization can benefit.
Some of the highlights include:
- Enterprise-grade security - Authentication, authorization, secure keystore for storing configurations. Plus integration with Apache Sentry and Apache Ranger.
- Preview mode - Ability to preview and debug data pipelines before deploying them.
- Joins in Cask Hydrator - Capabilities to join multiple data sources in data pipelines
- Real-time pipelines with Spark Streaming - Drag & drop real-time pipelines using Spark Streaming.
- Data usage analytics - Ability to report application usage of data sets.
- And much more!
Web Scale Reasoning and the LarKC ProjectSaltlux Inc.
The LarKC project aims to build an integrated pluggable platform for large-scale reasoning. It supports parallelization, distribution, and remote execution. The LarKC platform provides a lightweight core that gives standardized interfaces for combining plug-in components, while the real work is done in the plug-ins. There are three types of LarKC users: those building plug-ins, configuring workflows, and using workflows.
This document provides an overview of Apache Apex and real-time data visualization. Apache Apex is a platform for developing scalable streaming applications that can process billions of events per second with millisecond latency. It uses YARN for resource management and includes connectors, compute operators, and integrations. The document discusses using Apache Apex to build real-time dashboards and widgets using the App Data Framework, which exposes application data sources via topics. It also covers exporting and packaging dashboards to include in Apache Apex application packages.
The Taverna Suite provides tools for interactive and batch workflow execution. It includes a workbench for graphical workflow construction, various client interfaces, and servers for multi-user workflow execution. The suite utilizes a plug-in framework and supports a variety of domains, infrastructures, and tools through custom plug-ins.
This document outlines the agenda and content for a presentation on xPatterns, a tool that provides APIs and tools for ingesting, transforming, querying and exporting large datasets on Apache Spark, Shark, Tachyon and Mesos. The presentation demonstrates how xPatterns has evolved its infrastructure to leverage these big data technologies for improved performance, including distributed data ingestion, transformation APIs, an interactive Shark query server, and exporting data to NoSQL databases. It also provides examples of how xPatterns has been used to build applications on large healthcare datasets.
- The presentation introduces WS-VLAM, a workflow management system that aims to enable end-users to define, execute, and monitor e-science applications in a location-independent way.
- WS-VLAM adopts a service-oriented approach, implementing the workflow engine and repository as WSRF services using Globus Toolkit 4. This allows for interoperability with other workflow systems.
- Current work involves testing on rapid prototyping environments and planned integration with Taverna and Kepler to allow executing predefined VLAM workflows from within those systems.
Apache Airavata is a system that allows scientists to automate computational experiments and workflows without manual intervention. It collects experiment data and parameters, executes applications and workflows on computational resources, and returns results while providing ongoing progress updates to the user. Airavata has four main components - a workflow interpreter to manage execution, a resource manager to control applications and data transfers, a registry to define available applications and store results, and a messaging system to communicate progress.
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Provectus
Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing pipelines, as well as data ingestion and integration flows, supporting both batch and streaming use cases. In this presentation I will provide a general overview of Apache Beam and a comparison of the programming models of Apache Beam and Apache Spark.
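To make the unified model concrete, a minimal Beam word-count pipeline in the Python SDK is shown below; it runs on the DirectRunner as-is, and the same code can target other runners such as Flink or Dataflow.

```python
# Minimal Apache Beam word-count pipeline (Python SDK); runs on the
# DirectRunner by default.
import apache_beam as beam

with beam.Pipeline() as pipeline:
    (pipeline
     | "Create" >> beam.Create(["alpha beta", "beta gamma", "beta"])
     | "Split" >> beam.FlatMap(lambda line: line.split())
     | "PairWithOne" >> beam.Map(lambda word: (word, 1))
     | "CountPerWord" >> beam.CombinePerKey(sum)
     | "Print" >> beam.Map(print))
```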
Innovate2014 Better Integrations Through Open InterfacesSteve Speicher
- The document discusses open interfaces and integrated lifecycle tools through linked data and open standards like OSLC, taking inspiration from principles of the World Wide Web.
- It promotes using open protocols like REST and HTTP for tool integration instead of tight coupling, and outlines guidelines for using URIs, HTTP, and semantic standards like RDF and SPARQL to represent and share resource data on the web.
- OSLC is presented as a solution for lifecycle integration across requirements management, quality management, change management and other tools using common resource definitions and linked data over open APIs.
TDC Connections 2023 - A High-Speed Data Ingestion Service in Java Using MQTT...Juarez Junior
The document discusses a Java-based high-speed data ingestion service that can ingest data using several protocols including MQTT, AMQP, and STOMP. It introduces Reactive Streams Ingestion (RSI), a Java library that allows streaming and reactive ingestion of data into an Oracle database. The document also discusses using ActiveMQ and JMS messaging to consume messages and presents a sample project structure and architecture for a data ingestion application.
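As a language-neutral illustration of the ingestion pattern (explicitly not the Oracle RSI library described above), the Python sketch below subscribes to an assumed MQTT topic with paho-mqtt and batches rows into a local SQLite table.

```python
# Illustration of the MQTT ingestion pattern in Python; this is NOT the
# Oracle Reactive Streams Ingestion library described above, just a sketch
# of subscribing to a topic and batching rows into a local SQLite table.
# Requires: pip install "paho-mqtt<2" (callbacks use the 1.x API).
import sqlite3

import paho.mqtt.client as mqtt

BATCH_SIZE = 100
buffer = []
db = sqlite3.connect("ingest.db")
db.execute("CREATE TABLE IF NOT EXISTS readings (payload TEXT)")

def on_connect(client, userdata, flags, rc):
    client.subscribe("sensors/#")  # assumed topic filter

def on_message(client, userdata, msg):
    buffer.append((msg.payload.decode("utf-8", errors="replace"),))
    if len(buffer) >= BATCH_SIZE:
        db.executemany("INSERT INTO readings (payload) VALUES (?)", buffer)
        db.commit()
        buffer.clear()

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)  # assumed broker
client.loop_forever()              # callbacks run in this thread
```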
Spark Development Lifecycle at Workday - ApacheCon 2020Pavel Hardak
Presented by Eren Avsarogullari and Pavel Hardak (ApacheCon 2020)
https://www.linkedin.com/in/erenavsarogullari/
https://www.linkedin.com/in/pavelhardak/
Apache Spark is the backbone of Workday's Prism Analytics Platform, supporting various data processing use cases such as data ingestion, preparation (cleaning, transformation, and publishing), and discovery. At Workday, we extend the Spark OSS repo and build custom Spark releases covering our custom patches on top of the Spark OSS patches. Custom Spark release development introduces challenges when supporting multiple Spark versions against a single repo and dealing with large numbers of customers, each of which can execute their own long-running Spark applications. When building custom Spark releases and new Spark features, a dedicated benchmark pipeline is also important for catching performance regressions, by running the standard TPC-H and TPC-DS queries against both Spark versions and monitoring the runtime behavior of the Spark driver and executors before production. At the deployment phase, we also follow a progressive roll-out plan driven by feature toggles used to enable or disable new Spark features at runtime. As part of our development lifecycle, feature toggles help with various use cases such as selecting Spark compile-time and runtime versions, running test pipelines against both Spark versions in the build pipeline, and supporting progressive roll-out when dealing with large numbers of customers and long-running Spark applications. On the other hand, the operation-level runtime behavior of executed Spark queries is important for debugging and troubleshooting. The upcoming Spark release introduces a new SQL REST API exposing operation-level runtime metrics for executed queries, and we transform them into queryable Hive tables in order to track operation-level runtime behavior per executed query. In light of this, this session covers the Spark feature development lifecycle at Workday: the custom Spark upgrade model, the benchmark and monitoring pipeline, and the Spark runtime metrics pipeline, walking through the patterns and technologies used step by step.
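As a rough sketch of consuming such operation-level metrics, the snippet below polls the Spark SQL REST API exposed by the driver UI; the driver address is a placeholder and the response fields may vary by Spark version.

```python
# Hedged sketch of pulling operation-level metrics from Spark's SQL REST API
# (exposed by newer Spark releases under /api/v1/applications/<app-id>/sql).
# The driver UI address is a placeholder; response fields may vary by version.
import requests

DRIVER_UI = "http://localhost:4040"  # assumed driver UI address

def sql_executions():
    apps = requests.get(f"{DRIVER_UI}/api/v1/applications", timeout=5).json()
    app_id = apps[0]["id"]
    return requests.get(
        f"{DRIVER_UI}/api/v1/applications/{app_id}/sql", timeout=5).json()

if __name__ == "__main__":
    for execution in sql_executions():
        print(execution.get("id"), execution.get("status"),
              execution.get("duration"))
```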
Apache Spark Development Lifecycle @ Workday - ApacheCon 2020Eren Avşaroğulları
Workday uses Apache Spark as the foundational technology for its Prism Analytics product. It has developed a custom Spark upgrade model to handle upgrading Spark across its multi-tenant environment. Workday also collects runtime metrics on Spark SQL queries using a custom metrics pipeline and REST API. Future plans include upgrading to Spark 3.x and improving multi-tenancy support through a "Multiverse" deployment model.
Overview of Indiana University's Advanced Science Gateway support activities for drug discovery, computational chemistry, and other Web portals. For a broader overview of the OGCE project, see http://www.collab-ogce.org/ogce/index.php
Apache Eagle at Hadoop Summit 2016 San JoseHao Chen
Apache Eagle is a distributed real-time monitoring and alerting engine for Hadoop that was created by eBay and later open sourced as an Apache Incubator project. It provides security for Hadoop systems by instantly identifying access to sensitive data, recognizing attacks/malicious activity, and blocking access in real time through complex policy definitions and stream processing. Eagle was designed to handle the huge volume of metrics and logs generated by large-scale Hadoop deployments through its distributed architecture and linear scalability.
Apache Eagle is a distributed real-time monitoring and alerting engine for Hadoop that was created by eBay and later open sourced as an Apache Incubator project. It provides security for Hadoop systems by instantly identifying access to sensitive data, recognizing attacks/malicious activity, and blocking access in real time through complex policy definitions and stream processing. Eagle was designed to handle the huge volume of metrics and logs generated by large-scale Hadoop deployments through its distributed architecture and use of technologies like Apache Storm and Kafka.
This document summarizes a presentation on a Java library for high-speed streaming of data into databases. It discusses challenges with streaming data at scale and introduces Oracle's high speed streaming library, which uses direct path inserts and unified connection pooling to enable fast, scalable streaming. Code samples are provided for using the push and flow publisher APIs to ingest data streams into databases.
The IU Science Gateway Group supports the development of web-based scientific research tools and gateways. Led by Marlon Pierce and including several senior staff and interns, the group develops interfaces, workflows, and APIs. They foster sustainability through Apache projects like Airavata and Rave. The group collaborates widely and works to advance gateway computing through the Open Gateway Computing Environments partnership and XSEDE support activities.
This document discusses developing cyberinfrastructure to support computational chemistry workflows. It describes the OREChem project which aims to develop infrastructure for scholarly materials in chemistry. It outlines IU's objectives to build pipelines to fetch OREChem data, perform computations on resources like TeraGrid, and store results. It also discusses the GridChem science gateway which supports various chemistry applications and the ParamChem project which automates parameterization of molecular mechanics methods through workflows. Finally, it covers the Open Gateway Computing Environments project and efforts to sustain software through the Apache Software Foundation.
The document discusses Open Gateway Computing Environments (OGCE) and its software components. OGCE develops secure web-based science gateways for fields like chemistry, bioinformatics, biophysics, and environmental sciences. It is funded by the NSF. Key OGCE software includes the Gadget Container, GFAC for invoking scientific applications on grids and clouds, and workflow tools. Partners include Indiana University, NCSA, Purdue University, and UTHSCSA. The document provides examples of OGCE components in action, like UltraScan, GridChem, and BioVLAB. It also discusses building simple grid gadgets and computational chemistry workflows with GridChem.
This document summarizes the Open Grid Computing Environments (OGCE) project. It describes OGCE software tools like the Gadget Container, XBaya workflow composer, and GFAC application wrapper. It focuses on providing these tools to enable running science applications on grids and clouds. The tools can be used individually or together. OGCE outsources security and data services to providers like Globus, Condor, and iRods. It supports workflows like GridChem, UltraScan, and bioinformatics pipelines. The software is open source and available via anonymous SVN checkout.
OGCE Review for Indiana University Research Technologiesmarpierc
The document describes the Open Grid Computing Environments (OGCE) software suite and related activities. It provides an overview of various OGCE tools like the gadget container, XBaya workflow composer, and GFAC application wrapper service. It also summarizes collaborations with gateways like UltraScan, GridChem, and SimpleGrid to integrate OGCE tools and develop gateway components.
The OGCE team develops open source software for building secure science gateways in various domains like chemistry, bioinformatics, and environmental sciences. They are funded by the National Science Foundation to support the full lifecycle of gateway software development. Their software components enable web-based access to remote resources and tools.
This document provides an overview of the Open Grid Computing Environments (OGCE) project, including portals, services, workflows, gadgets, and tags they develop. It discusses how OGCE software is used in science gateways and contributes code back to these projects. It also summarizes upcoming and existing OGCE services, strategies for adopting web 2.0 technologies, examples of OGCE gadgets and integration with open social containers, and a plan to integrate these components for demonstration at SC09.
Cyberinfrastructure and Applications Overview: Howard University June22marpierc
1) Cyberinfrastructure refers to the combination of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people that enable knowledge discovery through integrated multi-scale simulations and analyses.
2) Cloud computing, multicore processors, and Web 2.0 tools are changing the landscape of cyberinfrastructure by providing new approaches to distributed computing and data sharing that emphasize usability, collaboration, and accessibility.
3) Scientific applications are increasingly data-intensive, requiring high-performance computing resources to analyze large datasets from sources like gene sequencers, telescopes, sensors, and web crawlers.
GTLAB Installation Tutorial for SciDAC 2009marpierc
GTLAB is a Java Server Faces tag library that wraps Grid and web services to build portal-based and standalone applications. It contains tags for common tasks like job submission, file transfer, credential management. GTLAB applications can be deployed as portlets or converted to Google Gadgets. The document provides instructions for installing GTLAB, examples of tags, and making new custom tags.
The document provides an overview of the Open Grid Computing Environments (OGCE) project, which develops and packages software for science gateways and resources. Key components discussed include the OGCE portal for building grid portals, Axis services for resource discovery and prediction, a workflow suite, and JavaScript and tag libraries. The document describes downloading and installing the OGCE software, which can be done with a single command, and discusses some of the portlets, services, and components included in the OGCE toolkit.
The document provides an overview of OGCE (Open Grid Computing Environment), which develops and packages reusable software components for science portals. Key components described include services, gadgets, tags, and how they fit together. Installation and usage of the various OGCE components is discussed at a high level.
The document discusses installing and building GTLAB, which contains a Grid portal, workflow suite, web services, and gadget container. It can be checked out from SVN or downloaded as a TAR file. To build GTLAB, edit the pom.xml file, run mvn clean install, and start the Tomcat server. Examples are provided and users can create new JSF pages and tags.
WinRAR Crack for Windows (100% Working 2025)sh607827
WinRAR Crack Free Download is a powerful archive manager that provides full support for RAR and ZIP archives and decompresses CAB, ARJ, LZH, TAR, GZ, ACE, and UUE archives.
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...Egor Kaleynik
This case study explores how we partnered with a mid-sized U.S. healthcare SaaS provider to help them scale from a successful pilot phase to supporting over 10,000 users—while meeting strict HIPAA compliance requirements.
Faced with slow, manual testing cycles, frequent regression bugs, and looming audit risks, their growth was at risk. Their existing QA processes couldn’t keep up with the complexity of real-time biometric data handling, and earlier automation attempts had failed due to unreliable tools and fragmented workflows.
We stepped in to deliver a full QA and DevOps transformation. Our team replaced their fragile legacy tests with Testim’s self-healing automation, integrated Postman and OWASP ZAP into Jenkins pipelines for continuous API and security validation, and leveraged AWS Device Farm for real-device, region-specific compliance testing. Custom deployment scripts gave them control over rollouts without relying on heavy CI/CD infrastructure.
The result? Test cycle times were reduced from 3 days to just 8 hours, regression bugs dropped by 40%, and they passed their first HIPAA audit without issue—unlocking faster contract signings and enabling them to expand confidently. More than just a technical upgrade, this project embedded compliance into every phase of development, proving that SaaS providers in regulated industries can scale fast and stay secure.
Meet the Agents: How AI Is Learning to Think, Plan, and CollaborateMaxim Salnikov
Imagine if apps could think, plan, and team up like humans. Welcome to the world of AI agents and agentic user interfaces (UI)! In this session, we'll explore how AI agents make decisions, collaborate with each other, and create more natural and powerful experiences for users.
Adobe Illustrator is a powerful, professional-grade vector graphics software used for creating a wide range of designs, including logos, icons, illustrations, and more. Unlike raster graphics (like photos), which are made of pixels, vector graphics in Illustrator are defined by mathematical equations, allowing them to be scaled up or down infinitely without losing quality.
Here's a more detailed explanation:
Key Features and Capabilities:
Vector-Based Design:
Illustrator's foundation is its use of vector graphics, meaning designs are created using paths, lines, shapes, and curves defined mathematically.
Scalability:
This vector-based approach allows for designs to be resized without any loss of resolution or quality, making it suitable for various print and digital applications.
Design Creation:
Illustrator is used for a wide variety of design purposes, including:
Logos and Brand Identity: Creating logos, icons, and other brand assets.
Illustrations: Designing detailed illustrations for books, magazines, web pages, and more.
Marketing Materials: Creating posters, flyers, banners, and other marketing visuals.
Web Design: Designing web graphics, including icons, buttons, and layouts.
Text Handling:
Illustrator offers sophisticated typography tools for manipulating and designing text within your graphics.
Brushes and Effects:
It provides a range of brushes and effects for adding artistic touches and visual styles to your designs.
Integration with Other Adobe Software:
Illustrator integrates seamlessly with other Adobe Creative Cloud apps like Photoshop, InDesign, and Dreamweaver, facilitating a smooth workflow.
Why Use Illustrator?
Professional-Grade Features:
Illustrator offers a comprehensive set of tools and features for professional design work.
Versatility:
It can be used for a wide range of design tasks and applications, making it a versatile tool for designers.
Industry Standard:
Illustrator is a widely used and recognized software in the graphic design industry.
Creative Freedom:
It empowers designers to create detailed, high-quality graphics with a high degree of control and precision.
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?steaveroggers
Migrating from Lotus Notes to Outlook can be a complex and time-consuming task, especially when dealing with large volumes of NSF emails. This presentation provides a complete guide on how to batch export Lotus Notes NSF emails to Outlook PST format quickly and securely. It highlights the challenges of manual methods, the benefits of using an automated tool, and introduces eSoftTools NSF to PST Converter Software — a reliable solution designed to handle bulk email migrations efficiently. Learn about the software’s key features, step-by-step export process, system requirements, and how it ensures 100% data accuracy and folder structure preservation during migration. Make your email transition smoother, safer, and faster with the right approach.
Read More: https://www.esofttools.com/nsf-to-pst-converter.html
Maxon Cinema 4D 2025 is the latest version of Maxon's 3D software, released in September 2024, and it builds upon previous versions with new tools for procedural modeling and animation, as well as enhancements to particle, Pyro, and rigid body simulations. CG Channel also mentions that Cinema 4D 2025.2, released in April 2025, focuses on spline tools and unified simulation enhancements.
Key improvements and features of Cinema 4D 2025 include:
Procedural Modeling: New tools and workflows for creating models procedurally, including fabric weave and constellation generators.
Procedural Animation: Field Driver tag for procedural animation.
Simulation Enhancements: Improved particle, Pyro, and rigid body simulations.
Spline Tools: Enhanced spline tools for motion graphics and animation, including spline modifiers from Rocket Lasso now included for all subscribers.
Unified Simulation & Particles: Refined physics-based effects and improved particle systems.
Boolean System: Modernized boolean system for precise 3D modeling.
Particle Node Modifier: New particle node modifier for creating particle scenes.
Learning Panel: Intuitive learning panel for new users.
Redshift Integration: Maxon now includes access to the full power of Redshift rendering for all new subscriptions.
In essence, Cinema 4D 2025 is a major update that provides artists with more powerful tools and workflows for creating 3D content, particularly in the fields of motion graphics, VFX, and visualization.
Adobe After Effects Crack FREE FRESH version 2025kashifyounis067
Adobe After Effects is a software application used for creating motion graphics, special effects, and video compositing. It's widely used in TV and film post-production, as well as for creating visuals for online content, presentations, and more. While it can be used to create basic animations and designs, its primary strength lies in adding visual effects and motion to videos and graphics after they have been edited.
Here's a more detailed breakdown:
Motion Graphics:
.
After Effects is powerful for creating animated titles, transitions, and other visual elements to enhance the look of videos and presentations.
Visual Effects:
.
It's used extensively in film and television for creating special effects like green screen compositing, object manipulation, and other visual enhancements.
Video Compositing:
.
After Effects allows users to combine multiple video clips, images, and graphics to create a final, cohesive visual.
Animation:
.
It uses keyframes to create smooth, animated sequences, allowing for precise control over the movement and appearance of objects.
Integration with Adobe Creative Cloud:
.
After Effects is part of the Adobe Creative Cloud, a suite of software that includes other popular applications like Photoshop and Premiere Pro.
Post-Production Tool:
.
After Effects is primarily used in the post-production phase, meaning it's used to enhance the visuals after the initial editing of footage has been completed.
Download YouTube By Click 2025 Free Full Activatedsaniamalik72555
Copy & Past Link 👉👉
https://dr-up-community.info/
"YouTube by Click" likely refers to the ByClick Downloader software, a video downloading and conversion tool, specifically designed to download content from YouTube and other video platforms. It allows users to download YouTube videos for offline viewing and to convert them to different formats.
Get & Download Wondershare Filmora Crack Latest [2025]saniaaftab72555
Copy & Past Link 👉👉
https://dr-up-community.info/
Wondershare Filmora is a video editing software and app designed for both beginners and experienced users. It's known for its user-friendly interface, drag-and-drop functionality, and a wide range of tools and features for creating and editing videos. Filmora is available on Windows, macOS, iOS (iPhone/iPad), and Android platforms.
4. Audience Introductions
Tell us about your work or studies and
your reasons for taking the tutorial
Sign in and take survey
5. Role Description
Gateway Developers and Providers: Use the Airavata API through an SDK in their favorite programming language.
Airavata Developers: Want to change Airavata components, experiment with different implementations.
Middleware Developers: Want to extend Airavata to talk to their middleware clients.
Resource Providers: Want to configure Airavata to work with their middleware.
6. What You Will Learn
• What a Science Gateway is and does.
• How to build a simple Science Gateway.
• How to integrate existing gateways with XSEDE and other resources.
• What it takes to run a production gateway.
8. What You Can Take Away
• A “test-drive” PHP gateway
– Open source code in GitHub
– We’ll run SciGaP services for you
• OR: Gateway client SDKs for your
programming language of choice
– Integrate with your gateway
– We’ll run SciGaP services for you
• OR: Airavata source code
– Play, use, or develop and contribute back
10. A Science Gateway for Biophysical Analysis
Usage statistics (in SUs) for the UltraScan Gateway Jan-Dec 2013
There are over 70 institutions in 18 countries actively using UltraScan/LIMS.
Implemented currently on 7 HPC platforms, including one commercial installation (non-public). In 2013, over 1.5 million service units were provided by XSEDE, Jülich, and UTHSCSA.
Ongoing Projects:
Integration of SAXS/SANS modeling
Integration of Molecular Dynamics (DMD)
Integration of SDSC Gordon Supercomputer with GRAM5 and UNICORE
Integration of Multi-wavelength optics (500-1000 fold higher data density)
Development of Performance Prediction algorithms through datamining
Development of a meta scheduler for mass submissions (of MW data)
11. UltraScan Goals:
Provide highest possible resolution in the analysis – requires HPC
Offer a flexible approach for multiple optimization methods
Integrate a variety of HPC environments into a uniform submission framework
(Campus resources, XSEDE, International resources)
Must be easy to learn and use – users should not have to be HPC experts
Support a large number of users and analysis instances simultaneously
Support data sharing and collaborations
Easy installation, easy maintenance
Robust and secure multi-user/multi-role framework
Provide check-pointing and easy to understand error messages
Fast turnaround to support serial workflows (model refinement)
A Science Gateway for Biophysical Analysis
14. UltraScan and Gateway Patterns
• Users create computational packages (experiments)
– Data to be analyzed, analysis parameters and control files.
– Computing resources to use and computing parameters (#
of processors, wall time, memory, etc)
• Experiments are used to launch jobs.
• Running jobs are monitored.
• Users may access intermediate details.
• Results from completed jobs (including failures) need to be moved to a permanent data management system.
15. UltraScan and Gateway Patterns,
Continued
• UltraScan users are not just XSEDE users.
• UltraScan provides access to campus and
international resources as well as XSEDE.
• Gateways are user-centric federating layers over grids, clouds, and no-middleware solutions.
16. PHP Gateway for Airavata (PGA)
Demo of and hands on with the Test
Drive Portal
http://test-drive.airavata.org/
PGA is a GTA
17. Application Field Scenario
AMBER (Molecular Dynamics): Production constant pressure and temperature (NPT) MD run of alanine dipeptide.
Trinity (Bioinformatics): RNA-Seq de novo assembly.
GROMACS (Molecular Dynamics): Minimize the energy of a protein from the PDB.
LAMMPS (Molecular Dynamics): Atomistic simulation of friction.
NWChem (Ab Initio Quantum Chemistry): Optimized geometry and vibrational frequencies of a water molecule.
Quantum Espresso (Ab Initio Quantum Chemistry): Self-consistent field calculations (that is, approximate many-body wave function) for aluminum.
WRF (Mesoscale Weather Forecasting): US East Coast storm from 2000.
See https://github.com/SciGaP/XSEDE-Job-Scripts/ for more information.
21. Getting the Demo Portal Code
• Demo Portal is available from GitHub
– Note: this may be moved to the Airavata Git Repo
– Look for announcements on the Airavata mailing lists
• Go to https://github.com/SciGaP/PHP-Reference-Gateway
– Create a clone
• Contribute back with pull requests and patches!
23. Client Objects for Job Submission
• Experiment Inputs (DataObject)
• User Configuration
• Computational Resource
• Experiment
• Advanced Output Handling
24. How to Create an Experiment
• Go to GitHub
– http://s.apache.org/php-clients
• Download the Airavata PHP client zip.
– https://dist.apache.org/repos/dist/dev/airavata/0.13/RC0/apache-airavata-php-sdk-0.13-SNAPSHOT-bin.tar.gz
– Look at createExperiment.php
– We also have C++
• If you prefer, we have training accounts on gw104
– Need SSH to access
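The client objects from slide 23 map onto PHP data-model classes shipped with the SDK. The sketch below shows roughly how an experiment is assembled before it is registered; the class and field names (DataObjectType, ComputationalResourceScheduling, UserConfigurationData, Experiment and their members) are assumptions based on the slide's object names and the 0.13-era models, and the host, queue, and file values are placeholders, so treat createExperiment.php in the downloaded SDK as the authoritative version.

// Sketch only; assumes the require_once imports shown on slides 68-69 and an
// initialized $airavataclient (slide 70). Class and field names are assumptions.

// One application input (the "Experiment Inputs (DataObject)" box).
$input = new DataObjectType();
$input->key = 'Input_File';                     // placeholder input name
$input->value = 'file:///tmp/input.in';         // placeholder value

// The "Computational Resource" box: where and how to run.
$scheduling = new ComputationalResourceScheduling();
$scheduling->resourceHostId = 'stampede.tacc.xsede.org';  // placeholder host ID
$scheduling->totalCPUCount = 16;
$scheduling->nodeCount = 1;
$scheduling->wallTimeLimit = 30;                // minutes
$scheduling->queueName = 'normal';

// The "User Configuration" box wraps the scheduling choices.
$userConfig = new UserConfigurationData();
$userConfig->computationalResourceScheduling = $scheduling;

// The "Experiment" box ties everything together.
$experiment = new Experiment();
$experiment->projectID = $projectId;            // from an earlier createProject/getProject call
$experiment->userName = 'tutorial-user';
$experiment->name = 'echo-test';
$experiment->applicationId = $appId;            // an application registered in the Application Catalog
$experiment->experimentInputs = array($input);
$experiment->userConfigurationData = $userConfig;

// Register the experiment; the returned ID is used to launch and monitor it.
$experimentId = $airavataclient->createExperiment($experiment);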
25. More API Usage
launchExperiment.php: How to run an experiment.
getExperimentStatus.php: How to check the status.
getExperimentOutput.php: How to get the outputs.
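Strung together, the samples cover the whole lifecycle after createExperiment. The condensed sketch below follows the sample file names; the exact signatures and enum names are assumptions and vary between releases (for example, launchExperiment may also require a credential-store token), so defer to the scripts themselves.

// $experimentId comes from createExperiment(), as sketched earlier.

// Launch the experiment (launchExperiment.php); the token argument is an assumption.
$airavataclient->launchExperiment($experimentId, $credentialToken);

// Poll the status until a terminal state is reached (getExperimentStatus.php).
// ExperimentState constants are assumed from the experiment data model.
do {
    sleep(5);
    $status = $airavataclient->getExperimentStatus($experimentId);
} while (!in_array($status->experimentState,
    array(ExperimentState::COMPLETED, ExperimentState::FAILED, ExperimentState::CANCELED)));

// Retrieve the outputs (getExperimentOutput.php); the method name is assumed.
$outputs = $airavataclient->getExperimentOutputs($experimentId);
foreach ($outputs as $output) {
    echo $output->key . ' => ' . $output->value . "\n";
}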
26. Hands On with PHP Command-Line Clients
Exercises with the clients.
Use the tar on your local machine or
else log into gw104
28. Behind the API
• The Airavata API server deposits experiment information in the Registry.
• The API server calls the Orchestrator to validate and launch jobs.
• The Orchestrator schedules the job and uses GFAC to connect to a specific resource.
• Multiple resources across administrative domains
– Campus, XSEDE, etc
– This requires different access mechanisms
30. A Few Observations on Successful Gateways
• Support familiar community applications.
• Make HPC systems easy for new user
communities who need HPC.
• Have champions who build and support the
community.
• Have a lot of common features.
31. SciGaP Goals: Improve sustainability by converging on a single set of hosted infrastructure services
35. Airavata API and Apache Thrift
• We use Apache Thrift to define the API.
• Advantages of Thrift
– Supports well-defined, typed messages.
– Custom defined errors, exceptions
– Generators for many different programming
languages.
– Some shielding from API versioning problems.
• Downsides of Thrift
– No message headers in TCP/IP, so everything must be
explicitly defined in the API.
– Limits on object-oriented features (polymorphism)
36. More Information
• ApacheCon 2014 Presentation: “RESTLess
Design with Apache Thrift: Experiences from
Apache Airavata”
• http://www.slideshare.net/smarru/restless-design-with-apache-thrift-experiances
38. Application Catalog Summary
• The Application Catalog includes descriptions
of computing resources and applications.
– It is part of the Registry.
• Call the API to add, delete, or modify a
resource or an application.
• The App Catalog API and data models are in
transition to Thrift.
– You can also still use legacy methods, which are fully featured.
40. Data Model Description
Compute Resource: Everything you need to know about a particular computing resource.
Application Interface: Resource-independent information about an application.
Application Deployment: Compute resource-specific information about an application.
Gateway Resource Profile: Gateway-specific preferences for a given computing resource.
41. Using the Application Catalog API
Walk through how to create and modify applications and computing resources.
Live examples with {lammps, gromacs, espresso, …} on {stampede, trestles, br2}.
http://s.apache.org/register-apps
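As a rough illustration of what the registration walk-through does, the sketch below mirrors the compute-resource / application-interface / application-deployment split from slide 40. The register-style method names and the description fields are assumptions (the previous slide notes the catalog API was still moving to Thrift), so follow the link above for the current calls.

// Hypothetical App Catalog sketch; method and field names are assumptions.

// 1. Describe the computing resource ("Compute Resource" on slide 40).
$host = new ComputeResourceDescription();
$host->hostName = 'stampede.tacc.xsede.org';            // placeholder host
$hostId = $airavataclient->registerComputeResource($host);

// 2. Describe the application independently of any resource ("Application Interface").
$interface = new ApplicationInterfaceDescription();
$interface->applicationName = 'LAMMPS';
$interfaceId = $airavataclient->registerApplicationInterface($interface);

// 3. Tie the application to a specific resource ("Application Deployment").
$deployment = new ApplicationDeploymentDescription();
$deployment->computeHostId = $hostId;
$deployment->executablePath = '/opt/apps/lammps/lmp_stampede';  // placeholder path
$airavataclient->registerApplicationDeployment($deployment);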
44. Airavata’s Philosophy
• There are lots of ways to build Web interfaces for
Science Gateways.
– By Hand: PHP, Twitter Bootstrap, AngularJS, …
– Turnkey Frameworks: Drupal, Plone, Joomla, …
– Science Gateway Frameworks: the SDSC Workbench,
HUBzero, WS-PGrade
• Gateway developers should concentrate on
building interfaces that serve their community.
• And outsource the general purpose services to
Airavata.
45. Apache Airavata Components
Airavata API Server: Thrift-generated server skeletons of the API and data models; directs traffic to the appropriate components.
Registry: Insert and access application, host machine, workflow, and provenance data.
Orchestrator: Handles experiment execution request validation, scheduling, and decision making; selects GFAC instances that can fulfill a given request.
Application Factory Service (GFAC): Manages the execution and monitoring of an individual application.
Workflow Interpreter Service: Executes the workflow on one or more resources.
Messaging System: WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events.
47. Getting Help
Airavata Developer List: Get in touch with Airavata developers. Good place for asking general questions. Lots of automated traffic noise. See apache.airavata.org/community/mailing-lists.html. Mailing list is public and archived online.
Airavata Jira: Official place for posting bugs, tasks, etc. Posts here also go to the developer mailing list. See https://issues.apache.org/jira/browse/airavata
Airavata users and architects lists: Can also post here. Use the “users” list for questions. The architects’ list is for higher-level design discussions.
Airavata Wiki and Website: The Wiki is frequently updated but also may have obsolete material. The Website should have time-independent documentation.
What else would you like to see? Google
Hangouts, IRCs, YouTube, social networks?
Improvements to the Website?
48. Airavata and SciGaP
• SciGaP: Airavata as part of a multi-tenanted
Gateway Platform as a Service
• Goal: We run Airavata so you don’t have to.
• Challenges:
– Centralize system state
– Make Airavata more cloud friendly, elastic
49. “Apache” Means “Open”
Join the Airavata developer mailing list, get involved,
submit patches, contribute.
Use Give Back
50. Getting Involved, Contributing Back
• Airavata is open source,
open community software.
• Open Community: you can
contribute back
– Patches, suggestions, wiki
documentation, etc
• We reward contributors
– Committers: write access
to master Git repo
– Project Management
Committee members: full,
binding voting rights
51. Some Contribution Opportunities
Registry: Better support for Thrift-generated objects; NoSQL and other backend data stores; fault tolerance.
Orchestrator: Pluggable scheduling; load balancing and elasticity.
GFAC: ZooKeeper-like strategies for configuring and managing.
Messenger: Investigate AMQP, Kafka, and other newer messaging systems.
Workflow Interpreter: Alternative workflow processing engines.
Overall: Message-based rather than direct CPI calls.
Airavata components (should) expose Component Programming Interfaces (CPIs) that allow you to switch out implementations. GFAC is also designed to be pluggable.
55. Building on the Tutorial Material
• Test-drive gateway will remain up.
• Go over the PHP samples.
• Get other code examples
– JAVA, C++, Python, …
• Contact the Airavata developer list for help.
56. Get Involved in Apache Airavata
• Join the architecture mailing
list.
• Join the developer list.
• All contributions welcome.
– Apache Software Foundation
owns the IP.
– Submit patches and earn
committership, PMC
membership.
57. Where Is Airavata 1.0?
• Airavata 1.0 will be
the stable version of
the API.
• Developer
community vote.
– Get involved
– September 2014
target
• Semantic versioning
58. Get Connected, Get Help
• Get Connected: Join the XSEDE Science
Gateway program community mailing list.
– Email [email protected]
– “subscribe gateways”
• Get Help: Get XSEDE extended collaborative
support for your gateway.
• Get Info: https://www.xsede.org/gateways-overview
62. Downloading Airavata
Clone the trunk through GitHub or Apache’s Git repo (mirrored): latest check-ins; for developers. Downside: may break; we (usually) have frequent releases that are more stable.
Download the official binary release from Apache: well tested; should work out of the box. Downside: won’t have the latest features.
Download the official source from Apache and build it: well tested; you can make local modifications. Downside: you still need to build it.
The current Airavata release is 0.11. Version 0.12 will be the first Thrift-based release and will be made in mid-June. We will return to a regular 6-8 week release stride.
Airavata as a Service is under development through the NSF-funded SciGaP Project. No download required.
63. Building Airavata from Source
• You need Apache Maven.
• Use “mvn clean install”
• Would you like to see other packaging?
–Pre-configured VMs?
–PaaS tools like OpenShift, Apache Stratos?
–Docker?
64. Running Airavata
• Modify airavata-server.properties (optional)
– airavata/modules/distribution/server/target/apache-airavata-server-{version}/bin/ if building from source.
– Typically required to specify the XSEDE community
credential.
• Start the server using airavata-server.sh.
– Same location as airavata-server.properties
66. Generating New Thrift Clients
• airavata/airavata-api/generate-thrift-files.sh
• Shows how to do this for Java, C++, and
PHP
• Caveat: requires Thrift installation
• See airavata/airavata-api/thrift-interface-descriptions/ for our Thrift definitions
67. Writing a PHP Interface to the Airavata API
This is a set of slides that summarizes
what you need to do to call the API
from PHP.
68. Anatomy of the PHP Command Lines
• Imports of Thrift Libraries
– require_once $GLOBALS['THRIFT_ROOT'] . 'Transport/TTransport.php';
– And so on
• Make sure the path to the libraries ('THRIFT_ROOT') is correct.
• The libs are generated from the IDL.
69. Importing Airavata Libraries
• Import of Airavata API
– require_once $GLOBALS['AIRAVATA_ROOT'] . 'API/Airavata.php';
• Import data models
– require_once $GLOBALS['AIRAVATA_ROOT'] . 'Model/Workspace/Types.php';
– require_once $GLOBALS['AIRAVATA_ROOT'] . 'Model/Workspace/Experiment/Types.php';
• These are 1-1 mappings from our Thrift API definition files.
70. Initialize the AiravataClient
• AiravataClient has all the API functions.
• You can configure the client manually or
through the AiravataClientFactory object.
– The examples do this manually
• We use the Thrift binary protocol and TSocket for the wire protocol and transport.
• We recommend giving the client a long timeout (5 seconds or more).
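Put concretely, a minimal client setup might look like the sketch below. It assumes the require_once imports from the previous slides and the non-namespaced Thrift PHP classes bundled with the SDK; the host, port, and timeout values are placeholders for your own Airavata server.

// Minimal setup sketch; host, port, and timeouts are placeholders.
$transport = new TSocket('localhost', 8930);   // your Airavata API server host and port
$transport->setSendTimeout(20000);             // generous timeouts, in milliseconds
$transport->setRecvTimeout(20000);

$protocol = new TBinaryProtocol($transport);   // binary wire protocol
$transport->open();

$airavataclient = new AiravataClient($protocol);
echo 'API version: ' . $airavataclient->getAPIVersion() . "\n";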
71. Make the API Calls
• $airavataclient = new AiravataClient($protocol);
• $project = $airavataclient->getProject('user1');
• And so on.
• Put these in try/catch blocks and catch
Airavata exceptions.
• Thrift exceptions will also be thrown.
• $transport->close() when done
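A sketch of that pattern follows; the Airavata exception class names are assumptions taken from the generated error definitions, and TException is the generic Thrift base exception.

try {
    $project = $airavataclient->getProject('user1');   // example call from this slide
    print_r($project);
} catch (InvalidRequestException $ire) {               // assumed Airavata API exception
    echo 'Invalid request: ' . $ire->getMessage() . "\n";
} catch (AiravataSystemException $ase) {               // assumed Airavata API exception
    echo 'Airavata system error: ' . $ase->getMessage() . "\n";
} catch (TException $te) {                             // generic Thrift transport/protocol error
    echo 'Thrift error: ' . $te->getMessage() . "\n";
}
$transport->close();                                   // close the transport when done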
73. Plan: Ask the Experts
• We are collaborating with Von Welch’s Center
for Trustworthy Cyberinfrastructure (CTSC)
– Von, Jim Basney, Randy Heiland
• This is an ongoing review
• Open issues are
– Thrift over TCP/IP or Thrift over HTTPS?
– Verifying user identity, gateway identity, and user
roles.
• Airavata architecture mailing lists
74. AuthN, AuthZ, and the Client API
• Airavata assumes a trust relationship with
gateways.
– Best practice: use SSL, mutual authentication.
– Firewall rules: clients come from known IPs.
• The gateway makes AuthN and AuthZ decisions
about its users.
• Airavata trusts the gateway’s assertions
• For command line examples, you are the gateway,
not the user
75. Auth{Z,N} Problems and Solutions
• Client-Airavata security is turned off for demo
• The approach does not scale.
– Clients may not have well-known IP addresses
(Desktop clients for example).
– Airavata service operators serving multiple
gateways have to manually configure
• Not self-service
• We are evaluating a developer key scheme.
– Modeled after Evernote client SDKs
76. More on Airavata Security
• “A Credential Store for Multi-tenant Science
Gateways”, Kanewala, Thejaka Amilia; Marru,
Suresh; Basney, Jim; Pierce, Marlon
– Proceedings of CCGrid 2014 (this conference!)
• http://hdl.handle.net/2022/17379
78. Airavata and Middleware
• Airavata’s philosophy is that gateways need ad-hoc access
to all kinds of resources that are not centrally managed.
– Grids, clouds, campus clusters
• These resources are accessed through many different
mechanisms.
– SSH, JSDL/BES, HTCondor, Globus, …
• These resources may also run job management systems
– PBS, SLURM
– Workflow engines
– Parameter sweep tools
– Hadoop and related tools
• So we’ll describe how to extend Airavata’s GFAC
79. GFAC Plugins: Providers and Handlers
• Providers: clients to common middleware.
– Airavata comes with SSH, Globus GRAM, BES/JSDL,
“localhost”, EC2 and other providers.
• Handlers: customize providers for specific resources.
– Set-up, stage-in, stage-out, and tear-down operations.
• A given GFAC invocation involves 1 Provider and 0 or
more Handlers.
• Gfac-core invokes Handlers in sequence before (“in-
handlers”) and after (“out-handlers”) the provider.
80. A Simple Example
• Let’s write two handlers and a localhost
provider.
• Prerequisite: you have Airavata installed
• InHandler: emails user
• Provider: runs “echo” via local execution
• OutHandler: emails user
81. Configure the InHandler
• Place this in gfac-config.xml.
• Note gfac-core lets you provide arbitrary properties.
<Handler class="org.apache.airavata.gfac.local.handler.InputEmailHandler">
  <property name="username" value="[email protected]"/>
  <property name="password" value="gmail-password-xxx"/>
  <property name="mail.smtp.auth" value="true"/>
  <property name="mail.smtp.starttls.enable" value="true"/>
  <property name="mail.smtp.host" value="smtp.gmail.com"/>
  <property name="mail.smtp.port" value="587"/>
</Handler>
82. Configure the OutHandler
These are read by the following code. initProperties is a
GFacHandler interface method.
<Handler class="org.apache.airavata.gfac.local.handler.OutputEmailHandler">
  <property name="username" value="[email protected]"/>
  <property name="password" value="gmail-password-xxx"/>
  <property name="mail.smtp.auth" value="true"/>
  <property name="mail.smtp.starttls.enable" value="true"/>
  <property name="mail.smtp.host" value="smtp.gmail.com"/>
  <property name="mail.smtp.port" value="587"/>
</Handler>

private Properties props;

public void initProperties(Properties properties) throws GFacHandlerException {
    props = properties;
}
88. Quick Thanks to Devs and Testers
• Eroma Abeysinghe
• Lahiru Gunathilake
• Yu Ma
• Raminder Singh
• Saminda Wijeratne
• Chathuri Wimalasena
• Sachith Withana
• Some may join via Google Hangout
91. Why Use Airavata for Gateways
• Open community, not just open source.
– Become a stakeholder, have a say.
• Gateway-centric overlay federation.
– Not tied to specific infrastructure.
• No additional middleware required.
– Campus clusters
– Let your sysadmins know
• Stable funding
• Lots of room for distributed computing research
– Pluggable services
92. Why Use SciGaP?
• Outsource common services, concentrate on
community building.
• Use same services as UltraScan, CIPRES, NSG
• Open deployments.
– No secrets about how we run services
• Lots of eyes on the services.
94. What’s On the Way?
• Airavata 1.0 targeted for September
– 0.13, 0.14 will follow in the next 2 months.
• OpenStack service hosting
– Rackspace has donated $2000/month usage.
• API production usage by UltraScan
– Have been using Airavata for last 9 months
• CIPRES integration
– NSG has some advanced requirements
• Continued improvement to PGA Portal
– Java, Python, and other GTAs welcome
• More fun with Identity Server and API
• Desktop SDKs
95. What is Analytical Ultracentrifugation (AUC)?
A technique to separate macromolecules dissolved in solution in a centrifugal force
field. AUC can “watch” the molecules while they are separating and identify and
differentiate them by their hydrodynamic properties and partial concentration. AUC
is a “first-principles” method, which does NOT require standards.
How does AUC work?
By applying a very large centrifugal force, molecules are separated based on their
molar mass and their shape. The molecules are observed using different optical
systems that detect different properties of the molecules, such as refractive index,
UV or visible absorbance, or fluorescent light emission.
What molecules can be studied?
Virtually any molecule, colloid or particle that can be dissolved in a liquid can be
measured by AUC, as long as it does not sediment by gravity alone. The molecule
or particle can be as small as a salt ion, or as large as an entire virus particle.
The ultracentrifuge can spin at 60,000 rpm, generating forces close to 300,000 x g.
Even small molecules like salt ions will sediment under such a force.
AUC Background
96. Simplified access to phylogenetics codes on powerful XSEDE resources
The CIPRES Gateway has allowed 8600+ users to access 45.7 million core hours of XSEDE time for 260,000 jobs, resulting in 700+ publications.
97. Easy user interface – providing easy model
upload, running of codes
Complete set of neuronal simulation tools –
NEURON, GENESIS, Brian, NEST, PyNN –
widely used by computational neuroscientists
Ability to easily get to the results, download
results
Democratize computational
neuroscience
The NSG is a simple and secure online science portal that provides access to computational
neuroscience codes on XSEDE HPC resources
http://www.nsgportal.org
Amit Majumdar (PI), Maryann Martone (Co-PI), Subha Sivagnanm (Sr. Personnel)
Kenneth Yoshimoto (Sr. Personnel), Anita Bandrowski (Sr. Personnel), Vadim Astakhov UCSD
Ted Carnevale (PI), Yale School of Medicine
NSF Awards: ABI #1146949; ABI #1146830
98. The UltraScan science gateway supports
high performance computing analysis of
biophysics experiments using XSEDE,
Juelich, and campus clusters.
Desktop analysis tools
are integrated with the
Web portal.
Launch analysis
and monitor
through a
browser
We can help you build
gateways for your lab or
facility.