In recent years we have seen an explosion of languages that run on the Java Virtual Machine. We have also seen existing languages get their implementations rewritten for the JVM. Alongside this came rapid development of tools such as parsers and bytecode generators. Even inside the JVM we saw initiatives like the Da Vinci Machine Project, which led to invokedynamic in JDK 7, and the more recent development of the Graal and Truffle projects.
Is it really hard to write a new programming language that runs on the JVM? Even if you are not going to write your own, I think it is worth understanding how your favorite language runs under the covers, how early decisions can affect language extensibility and performance, and what the JVM itself and the JVM ecosystem have to offer language implementors.
During the session I will try to familiarize you with the options you have when choosing parsers and bytecode manipulation libraries, which language implementation to consider, and how to test and tune your "new baby". Will you be able, after this session, to develop a shiny new language packed with killer features? No. But you will certainly understand the difference between lexers and parsers, how bytecode works, and why invokedynamic, Graal, and Truffle are so important to the future of the JVM platform. Will we have time to write a simple, compiled language?
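The lexer/parser distinction mentioned above can be illustrated with a minimal sketch. It is written in Python rather than for the JVM, and the tiny arithmetic grammar, the token names (`NUM`, `OP`), and the tuple-based tree are all assumptions made for illustration, not material from the talk. The lexer turns characters into a flat token list; the parser turns that list into a nested tree that respects operator precedence.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); the kinds NUM and OP are illustrative names

TOKEN_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<OP>[-+*/()]))")


def lex(source: str) -> List[Token]:
    """Lexer: turn a character stream into a flat list of tokens."""
    source = source.rstrip()
    tokens: List[Token] = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser: turn the token list into a nested tuple tree."""

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Token:
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", "")

    def advance(self) -> Token:
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUM | '(' expr ')'
        kind, text = self.advance()
        if kind == "NUM":
            return int(text)
        if (kind, text) == ("OP", "("):
            node = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")


def parse(source: str):
    parser = Parser(lex(source))
    tree = parser.expr()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return tree
```

For example, `lex("1 + 2*3")` yields five tokens with no structure, while `parse("1 + 2*3")` yields `("+", 1, ("*", 2, 3))`, where the multiplication is nested below the addition. A compiler back end would walk such a tree to emit bytecode.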
This document provides 10 tips for improving Perl performance. Some key tips include using a profiler like Devel::NYTProf to identify bottlenecks, optimizing database queries with DBI, choosing fast hash storage like BerkeleyDB, avoiding serialization with Data::Dumper in favor of faster options like JSON::XS, and considering compiling Perl without threads for a potential 15% speed boost. Proper use of profiling is emphasized to avoid wasting time optimizing the wrong parts of code.
Jan Stępień - GraalVM: Fast, Polyglot, Native - Codemotion Berlin 2018 - Codemotion
GraalVM challenges the status quo on the JVM. This newly-released JIT compiler brings substantial speed improvements and support for polyglot applications. It also allows us to translate our JVM bytecode into self-contained native binaries. In this session we’ll explore GraalVM’s potential. We’ll focus on Clojure, but our discussion will apply to many more programming languages. We’ll use GraalVM to build small native binaries. We’ll discuss the method’s limitations and their impact. Finally, we’ll build complete Clojure web apps weighing a fraction of their traditional JVM incarnations.
The presentation describes how to install NLTK and covers the basics of text processing with it. The slides were meant to support the talk and may not contain much detail. Many of the examples given in the slides are from the NLTK book (http://www.amazon.com/Natural-Language-Processing-Python-Steven/dp/0596516495/ref=sr_1_1?ie=UTF8&s=books&qid=1282107366&sr=8-1-spell).
This document provides examples of common SQL anti-patterns and related NoSQL alternatives. It discusses issues like using tables as trees, caches, queues, or logs. It also addresses dynamic schema/table creation, stored procedures, row padding, complex joins, and object-relational mismatches. The document recommends alternatives like document databases, key-value stores, message brokers, and denormalization. It includes examples of modeling book data in MongoDB and Redis.
Large scale nlp using python's nltk on azure - cloudbeatsch
This presentation provides an introduction to natural language processing (nlp) using python's natural language toolkit (nltk). Furthermore it describes how to run python (and more specifically nltk) as an elastic webjob on Azure.
Tips and tricks for optimizing Python websites - Fabiano Weimar
This document discusses optimizations that can be made to Python websites to improve performance. It begins by showing benchmark tests of the default Zope and Plone installations, which are slow. Installing Squid as a caching proxy improves performance slightly. Further optimizations, like adding the CacheFu caching application and configuring caching rules, reduce response times dramatically under load testing. The document concludes with tips like profiling HTTP headers, using browser caching, and developer tools to optimize Python websites.
The document discusses how Groovy provides a simpler and more concise way to work with Java code for tasks like file input/output, XML parsing, and configuration compared to Java and other languages like Perl and Ruby. It highlights Groovy features like built-in support for closures, the Elvis operator, date formatting, and ExpandoMetaClass that allow for more readable and expressive code.
The document discusses the Rust compiler and borrow checker. It provides an overview of the stages of compilation including lexical analysis, parsing, semantic analysis, optimization, and code generation. It then does a deep dive on the borrow checker, explaining that it tracks variable initializations and moves in order to catch errors like attempting to use a variable after it has been moved.
Python Refactoring with Rope and Traad – The rope library is a powerful tool for refactoring Python code, but to be truly useful it needs to be available to development environments. Traad is a tool which makes it simpler to integrate rope into nearly any tool by exposing a simple HTTP API. In this session we’ll look at how traad and rope work together, and we’ll see how traad integrates with at least one popular editor.
UnQLite is an embedded key-value and document-oriented database with a simple API similar to SQLite. It uses a BSD license and supports cross-platform usage. Benchmarks show it has comparable or better performance than SQLite, Berkeley DB and other databases for common operations like storing, fetching, and iterating over large amounts of data. The developer is working on adding new storage engines to UnQLite.
Rihards Olups - Encrypting Daemon Traffic With Zabbix 3.0 - Zabbix
Are you paranoid? Even if you are not, it might be a good idea to encrypt your email. Important documents. Communication. While monitoring data is not secret in many cases, transmitting it in plaintext over the Internet does make some people nervous. And now, with Zabbix 3.0, there's a built-in way to encrypt communication between components, including Zabbix server, proxy, agent, get and sender. Yes, them all. In this short talk we'll learn about the available modes, supported libraries and how to configure it all to still make sense a few years later.
Zabbix Conference 2015
Modern javascript localization with c-3po and the good old gettext - Alexander Mostovenko
This document summarizes a presentation about localization in modern JavaScript applications using GNU gettext. Some key points:
- GNU gettext is recommended over ICU due to better tooling and compatibility with existing backend formats.
- C-3po is an open source library that improves on gettext by allowing extraction and resolution of translations directly from JavaScript code using tagged template literals.
- It implements an extraction/merge/resolve workflow that allows developers and translators to work independently and precompiles translations for faster loading.
Hecl is a scripting language designed for mobile development that aims to be tiny, flexible, and simple; it is interpreted and dynamically typed, and can be embedded in Java applications to add scripting capabilities; the document discusses Hecl's features, syntax, embedding process, and how extensions can be created by writing Java classes that implement new commands.
Introduction to source{d} Engine and source{d} Lookout source{d}
Join us for a presentation and demo of source{d} Engine and source{d} Lookout. Combining code retrieval, language-agnostic parsing, and git management tools with familiar APIs, source{d} Engine simplifies code analysis. source{d} Lookout is a service for assisted code review that enables running custom code analyzers on GitHub pull requests.
The document discusses PHP streams. It defines a stream as a resource that exhibits a flow or succession of data. A wrapper tells a stream how to handle specific protocols and encodings. A context is a set of parameters and options that tell a stream or filter how to behave. Common built-in PHP streams include file, http, and ftp streams. Filters perform operations on stream data and can be used to modify stream contents.
DevConf 2016
"Development of the PHP-7 branch", Dmitry Stogov (Zend Technologies)
I will talk about the internals of PHP-7.0, the changes being prepared for PHP-7.1, and the plans for PHP-7.2.
This short presentation draws on the computational complexity of Perl 5 regexes, the experimental features introduced to Perl 5 later on, and the parsing expression grammars (PEGs) in Perl 6. It shows some examples of how PEGs can be used for exploratory data parsing.
The Ring programming language version 1.10 book - Part 49 of 212 - Mahmoud Samir Fayed
The Natural Library allows defining natural languages and commands that can be executed from natural text. Key methods include SetLanguageName(), SetCommandsPath(), UseCommand(), and RunString(). Commands are defined as classes specifying the syntax using methods like SyntaxIsKeyword() and SyntaxIsCommand(). This allows commands to be executed from natural text by parsing the keywords and arguments.
The document discusses using Protocol Buffers for network protocol design between web games. It provides an example of defining a message format in Protobuf, compiling it, and then reading and writing messages in Python. It also discusses using Protocol Buffers between Python and ActionScript clients and servers, including designing a header message to identify different message types and handling message dispatching.
The Ring programming language version 1.5.3 book - Part 39 of 184 - Mahmoud Samir Fayed
The Natural Library allows defining natural languages in Ring with just a few lines of code. It provides classes to (1) define a natural language, (2) set commands and operators, and (3) run natural code from files or strings. Commands are defined as classes specifying syntax and functionality. The library parses input and calls command classes. This allows quickly building domain-specific languages for tasks like data entry through natural language.
The document provides an overview of a presentation given by Stephan Schmidt on connecting PHP and JavaScript using JSON-RPC. Some key points:
- It discusses the classic web application model and how business logic resides solely on the server
- With Web 2.0, presentation logic moved to the client but business logic still resides on the server
- The remote proxy pattern can be used to expose server-side business logic as JavaScript objects, making remote calls transparent to the client
- This is done by serializing calls to JSON and making HTTP requests to a JSON-RPC server implemented in PHP
- The server uses reflection to dynamically call the relevant PHP methods and return responses also serialized to JSON
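The remote proxy and reflection-based dispatch described in the points above can be sketched as follows. This is a Python stand-in, not the PHP and JavaScript code from the presentation: `Calculator`, `handle_request`, and `RemoteProxy` are hypothetical names, the message shape only loosely follows the JSON-RPC request/response layout (`method`, `params`, `id` and `result`, `error`, `id`), and an in-process function call stands in for the HTTP request.

```python
import json


class Calculator:
    """Hypothetical server-side business logic exposed to clients."""

    def add(self, a, b):
        return a + b


def handle_request(service, raw_request: str) -> str:
    """Server side: decode the call, find the method by name, encode the reply."""
    request = json.loads(raw_request)
    name = request["method"]
    # Reflection: look the method up by its name at runtime.
    # Names starting with "_" are refused so internals are not exposed.
    method = None if name.startswith("_") else getattr(service, name, None)
    if callable(method):
        result, error = method(*request.get("params", [])), None
    else:
        result, error = None, f"unknown method {name}"
    return json.dumps({"id": request.get("id"), "result": result, "error": error})


class RemoteProxy:
    """Client side: any attribute access becomes a serialized remote call."""

    def __init__(self, transport):
        # transport is any callable taking a request string and returning
        # a response string; in a real setup it would perform an HTTP POST.
        self._transport = transport
        self._next_id = 0

    def __getattr__(self, name):
        def call(*params):
            self._next_id += 1
            raw = json.dumps({"method": name, "params": list(params), "id": self._next_id})
            reply = json.loads(self._transport(raw))
            if reply["error"] is not None:
                raise RuntimeError(reply["error"])
            return reply["result"]

        return call


def local_transport(raw_request: str) -> str:
    # Assumption for the sketch: no network, the "server" runs in-process.
    return handle_request(Calculator(), raw_request)
```

With this in place, `RemoteProxy(local_transport).add(2, 3)` reads like a local call although the arguments travel as JSON and the server resolves `add` by name. Calling a method the service does not define raises `RuntimeError` on the client.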
This document discusses using Gradle for building projects in multiple languages. Gradle's domain specific language is based on Groovy, which allows for concise syntax. Gradle supports building Java, C++, Ruby, and other languages through plugins. It can also be used to build documentation and publish artifacts to repositories. Migrating from other build systems like Ant, Maven, and Make to Gradle is also discussed.
Linux kernel TLS and HTTPS / Alexander Krizhanovsky (Tempesta Technologies) - Ontico
HighLoad++ 2017
"Moscow" hall, November 7, 11:00
Abstract:
http://www.highload.ru/2017/abstracts/3018.html
It is probably no secret by now that TLS support is being integrated into the Linux kernel: it is already in the current Linux 4.13 RC.
In this talk I want to explain why TLS is being added to the Linux kernel and describe the approaches to Linux kernel TLS taken by Facebook/RedHat, Mellanox, and our project, Tempesta FW. I will also cover the kernel-specific problems of implementing TLS.
...
Slides for my talk at BCN WordCamp 2016. Improve the performance of WordPress installations by using the right tool at every corresponding level in the technology stack.
Introduction to Groovy (Serbian Developer Conference 2013) - Joachim Baumann
The document provides an overview of the Groovy programming language. It discusses key features of Groovy including that everything in Groovy is an object, its support for closures as first-class functions, and dynamic typing. Examples are given demonstrating how Groovy code is more concise than Java for common tasks like XML parsing, building a simple web server, and creating domain specific languages.
Crash Course in Natural Language Processing (2016) - Vsevolod Dyomkin
This document provides an overview of natural language processing (NLP) including:
1. An introduction to NLP and its intersection with computational linguistics, computer science, and statistics.
2. A discussion of common NLP problems like tokenization, tagging, parsing, and their rule-based and statistical approaches.
3. An explanation of machine learning techniques for NLP like language models, naive Bayes classifiers, and dependency parsing.
4. Steps for developing an NLP system including translating requirements, experimentation, and going to production.
This document discusses practical aspects of natural language processing (NLP) work. It contrasts research work, which involves setting goals, devising algorithms, training models, and testing accuracy, with development work, which focuses on implementing algorithms as scalable APIs. The document emphasizes that obtaining data is crucial for NLP and describes sources for structured, semi-structured, and unstructured data. It recommends Lisp as a language that supports the interactivity, flexibility, and tree processing needed for NLP research and development work.
This document provides principles and guidelines for information presentation and effective communication for a TEDx event. It recommends focusing content on the event website, outsourcing social media to platforms like Facebook and YouTube, and using brief informative formats with images. For speaker presentations, it suggests including a short bio, topic, and reason to listen. The document stresses consistency, regularity, details to differentiate, and focusing on engaging a passionate audience rather than everyone.
This document provides an overview of Lisp Machines and the Genera operating system. It discusses that Lisp Machines had specialized hardware for Lisp data types and features like garbage collection to optimize for Lisp. It also describes that Genera had an open, extensible architecture with data-level integration where all code and data existed in a single shared memory space. Key concepts of Genera included extensibility, reusability, and transparency where the entire system was inspectable and modifiable.
This document discusses Lisp and its uses as a universal wrapper language. Some key points:
- Lisp makes complex things accessible and simple things fall into place on their own. Its core principles include everything being an expression, everything being a first-class citizen, programs being living entities, and its core consisting of 25 basic forms that can be customized.
- Lisp is a computationally oriented language. Examples are given of using Lisp to wrap SQL, build a poor man's ORM, wrap "black box" services like Redis, and implement complex algorithms like CKY parsing.
- Resources and tools for learning Lisp are provided, and questions about why, what for, and who Lisp is
This document discusses using Lisp for practical natural language processing (NLP). It begins with an overview of NLP practice, including research work like setting goals, devising algorithms, training models, and testing accuracy. It then discusses some pros and cons of using Lisp for NLP, including its support for interactivity, mathematical foundations, and tree structures. Examples are given of interactive Lisp programs and APIs. The document emphasizes that data is key for NLP and discusses sources for collecting data. It concludes that Lisp is well-suited for NLP research and development due to its interactive and flexible nature.
This document summarizes a talk given to Python developers about the Lisp programming language. It discusses some myths about Lisp's syntax, libraries, and community. It also highlights features of Lisp like macros, functional programming capabilities, multimethods, special variables, and powerful condition systems. Lisp is described as a multi-paradigm language that is highly customizable through features like macros while also being high performance.
What one needs to know to work in Natural Language Processing field and the aspects of developing an NLP project using the example of a system to identify text language
This document summarizes a talk on sugaring Lisp for the 21st century. It discusses common complaints about Common Lisp including a lack of standard libraries and threading support. It also describes efforts to modernize Common Lisp like CDR and CL21. Additionally, it introduces RUTILS, a package of over 250 symbols that aims to be backward compatible, practical, and modular in evolving Common Lisp. The document encourages attendees to use RUTILS directly or borrow its ideas and discusses some open issues.
This document provides an overview of natural language processing (NLP) including the linguistic basis of NLP, common NLP problems and approaches, sources of NLP data, and steps to develop an NLP system. It discusses tokenization, part-of-speech tagging, parsing, machine learning approaches like naive Bayes classification and dependency parsing, measuring word similarity, and distributional semantics. The document also provides advice on going from research to production systems and notes areas not covered like machine translation and deep learning methods.
This document provides an overview of natural language processing (NLP) including popular NLP problems, levels of NLP, the role of linguistics, sources of NLP data, tools and algorithms used in NLP, types of models including language models, and considerations for building practical NLP systems. It also describes a practical example of building a language detection system using word language models trained on Wiktionary data and evaluated using Wikipedia test data.
NLP is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language. It is also called computational linguistics, a field that additionally concerns how computational methods can aid the understanding of human language.
Slides for my talk at SkyCon'12 in Limerick.
Here I've squeezed four talks into one, covering a lot of ground quickly, so I've included links to more detailed presentations and other resources.
This document provides an overview of Elasticsearch and how to use it with .NET. It discusses what Elasticsearch is, how to install it, how Elasticsearch provides scalability through its architecture of clusters, nodes, shards and replicas. It also covers topics like indexing and querying data through the REST API or NEST client for .NET, performing searches, aggregations, highlighting hits, handling human language through analyzers, and using suggesters.
Perl - laziness, impatience, hubris, and one liners - Kirk Kimmel
Perl provides tools like perldoc, cpan, and Perl::Tidy to help developers work more efficiently. One-liners allow running Perl commands and programs directly from the command line. ExtUtils::Command provides functions that emulate common shell commands to make Perl scripts more portable. Perl::Tidy can reformat code to make it more readable.
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course PROIDEA
PostgreSQL is a battle-tested, open source database with a colorful history dating back to 1987. It has many advantages for a next project, including support for multiple programming languages for stored procedures, handling of XML and JSON, strong error reporting and logging, and window functions. It has a solid architecture with well-designed processes for handling write-ahead logs, statistics collection, and query optimization. While PostgreSQL has a learning curve, its longevity, stability, feature set and performance make it a great choice for many applications.
This document provides an overview of Puppet, an open source configuration management tool. It discusses key Puppet concepts like infrastructure as code, reproducible setups, and aligned environments. It also describes Puppet's architecture including the Puppet master, agent nodes, catalogs, resources, and the lifecycle of a Puppet run. The Puppet language is declarative and node-based. Resources are defined and organized into classes. Relationships between resources can be specified.
This document discusses Domain Specific Languages (DSLs) and how they can be implemented using Ruby and JRuby. It provides examples of internal and external DSLs using annotations, modules, and ActiveRecord associations. JRuby allows Ruby code to interact with Java classes and vice versa, opening up possibilities for DSL implementation across both languages.
- A regular expression (RE) is a pattern that is matched against strings to check if they contain the pattern or extract information from them. REs can be used for validation, parsing, and extraction of data from strings.
- REs are very powerful but can also become complex. They are not always human-readable. While fast to execute, REs require initial compilation into a DFA.
- The document provides examples of basic RE components like ^, $, *, +, ?, character classes and grouping. It also demonstrates matching strings to a RE pattern. More advanced features like lookahead are mentioned but not covered in detail.
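The basic components listed above (anchors, quantifiers, character classes, grouping) can be tried out with a short sketch. It uses Python's standard `re` module as a stand-in for whatever engine the original slides used, and the patterns and sample strings are made up for illustration.

```python
import re

# Validation: ^ and $ anchor the pattern to the start and end of the string,
# \d is a character class for digits, and + means "one or more".
DATE_RE = re.compile(r"^(\d+)-(\d+)-(\d+)$")


def is_date(text: str) -> bool:
    return DATE_RE.match(text) is not None


def split_date(text: str):
    """Extraction: each parenthesized group captures one part of the match."""
    match = DATE_RE.match(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# ? makes the preceding item optional: matches both "color" and "colour".
COLOUR_RE = re.compile(r"^colou?r$")

# * means "zero or more": matches "ac", "abc", "abbc", ...
AB_STAR_C_RE = re.compile(r"^ab*c$")

# A bracketed character class with + pulls out runs of letters.
WORD_RE = re.compile(r"[A-Za-z]+")


def words(text: str):
    return WORD_RE.findall(text)
```

Here `is_date("2016-03-01")` is true while `is_date("March 1")` is false, `split_date("2016-03-01")` returns `(2016, 3, 1)`, and `words("REs: fast, but terse!")` returns `["REs", "fast", "but", "terse"]`.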
The document discusses the history and development of JSON (JavaScript Object Notation). It describes how Douglas Crockford discovered JSON in 2001, developed its specification with a simple one-page website, and then it was adopted widely without much promotion. JSON provided a useful format for browser/server communication and became very popular due to its simplicity, becoming a standard part of JavaScript.
At the Dublin Fashion Insights Centre, we are exploring methods of categorising the web into a set of known fashion related topics. This raises questions such as: How many fashion related topics are there? How closely are they related to each other, or to other non-fashion topics? Furthermore, what topic hierarchies exist in this landscape? Using Clojure and MLlib to harness the data available from crowd-sourced websites such as DMOZ (a categorisation of millions of websites) and Common Crawl (a monthly crawl of billions of websites), we are answering these questions to understand fashion in a quantitative manner.
The latest generation of big data tools such as Apache Spark routinely handle petabytes of data while also addressing real-world realities like node and network failures. Spark's transformations and operations on data sets are a natural fit with Clojure's everyday use of transformations and reductions. Spark MLlib's excellent implementations of distributed machine learning algorithms puts the power of large-scale analytics in the hands of Clojure developers. At Zalando's Dublin Fashion Insights Centre, we're using the Clojure bindings to Spark and MLlib to answer fashion-related questions that until recently have been nearly impossible to answer quantitatively.
Hunter Kelly @retnuh
tech.zalando.com
Building DSLs On CLR and DLR (Microsoft.NET)Vitaly Baum
The document describes a domain specific language (DSL) for specifying tests of a MiniBar simulation using the Specter testing framework in C#. It provides an example context and specifications to test that drinking a beer does not throw an exception, that drinking 5 beers results in a $-5 balance, and that drinking more than 10 beers throws an exception indicating the user is drunk. The specifications are translated to NUnit test methods with asserts to test the MiniBar behavior.
A talk about Neo4j after one year of using it in production. The presentation covers database structure (internals), Cypher queries, extension development, and database tuning and settings.
This document discusses metaprogramming, metaclasses, and the metaobject protocol. It begins with an overview and definitions of key concepts like metaprogramming, metaobjects, and metaclasses. It then covers specific implementations including macros in Lisp, CLOS metaclasses, and how AllegroCache uses a persistent metaclass. Finally, it discusses the metaobject protocol and provides examples of how it allows programs to access and manipulate normally hidden language elements like classes and methods.
The document discusses parsing and Scala parser combinators. It provides an example of using parser combinators to define a parser that parses a line of text into a WordFreq case class with a string and integer field. The parser combinators approach allows defining parsing functions that are combined to parse more complex structures. This provides a robust yet easier way to define parsers compared to other options like hand-written parsers.
This document summarizes an Apache Spark workshop that took place in September 2017 in Stockholm. It introduces the speaker's background and experience with Spark. It then provides an overview of the Spark ecosystem and core concepts like RDDs, DataFrames, and Spark Streaming. Finally, it discusses important Spark concepts like caching, checkpointing, broadcasting, and resilience.
Redis is an in-memory data structure store that can be used as a database, cache, and message broker. It supports string, list, set and sorted set data types and provides operations on each type. Redis is fast, open source, and can be used for tasks like caching, leaderboards, and workload distribution between processes.
Apache Spark is a fast and general engine for large-scale data processing. It uses RDDs (Resilient Distributed Datasets) that allow data to be partitioned across clusters. Spark supports operations like transformations that create new RDDs and actions that return values. Key operations include map, filter, reduceByKey. RDDs can be persisted in memory to improve performance of iterative jobs. Spark runs on clusters managed by YARN, Spark Standalone, or Mesos and provides a driver program and executors on worker nodes to process data in parallel.
Here are a few ways a DSL could potentially benefit a current project:
- Simplify or reduce complexity of certain tasks. A DSL tailored to a specific domain or problem could make those tasks easier to understand and perform.
- Improve productivity of non-programmers. A DSL designed with the intended users in mind could allow others like domain experts or analysts to accomplish things without programming.
- Enforce correctness or best practices. By limiting what can be expressed in a DSL, it reduces the possibility of certain errors or forces compliance with standards.
- Separate logic from implementation. Expressing logic or algorithms in a DSL abstracts it away from implementation details, making it more portable and maintain
15. A Special-Purpose Util
(define-lazy-singleton word-tokenizer
    (make 'postprocessing-regex-word-tokenizer)
  "Default word tokenizer.")

(defun tokenize-ngram (ngrams str)
  "Transform string STR to a list if necessary
   (depending on the order of NGRAMS)."
  ;; Only n-gram models of order 2 and above need STR split into words.
  (if (> (ngrams-order ngrams) 1)
      (tokenize <word-tokenizer> str)
      str))
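The slide uses `define-lazy-singleton` without showing its definition. As a rough sketch of what such a macro might do, here is one possible expansion. The name `define-lazy-singleton-sketch`, the `*...-INSTANCE*` variable, and the expansion itself are assumptions made for illustration, not the library's actual implementation. The idea is that `<name>` becomes a symbol macro that builds the object on first access and reuses it afterwards.

```lisp
(defmacro define-lazy-singleton-sketch (name init &optional docstring)
  "Define <NAME> as a symbol macro whose value is created by INIT on first use."
  (let ((var (intern (format nil "*~A-INSTANCE*" (symbol-name name))))
        (accessor (intern (format nil "<~A>" (symbol-name name)))))
    `(progn
       ;; Holds the instance; NIL means "not created yet".
       (defvar ,var nil ,@(when docstring (list docstring)))
       ;; Every reference to <NAME> expands into this lazy lookup.
       (define-symbol-macro ,accessor
           (or ,var (setf ,var ,init))))))

;; Hypothetical usage: the list stands in for a real tokenizer object.
(define-lazy-singleton-sketch demo-tokenizer
    (list :tokenizer :created)
  "Hypothetical default instance.")
```

One limitation of this simple form: if INIT returns NIL, it is evaluated again on the next access.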
18. Basic Cell
(defclass regex-word-tokenizer (tokenizer)
  ((regex :accessor tokenizer-regex
          :initarg :regex
          :initform
          (re:create-scanner
           "\\w+|[!\"#$%&'*+,./:;<=>?@^`~…\\(\\)⟨⟩{}\\[\\|\\]‒–—―«»“”‘’¶-]")
          :documentation
          "A simpler variant would be [^\\s]+;
           it doesn't split punctuation, yet
           sometimes that's desirable."))
  (:documentation
   "Regex-based word tokenizer."))
19. Basic Cell
(defmethod tokenize
    ((tokenizer regex-word-tokenizer) string)
  ;; The match positions are consumed pairwise as (BEG END),
  ;; hence the :by #'cddr step.
  (loop
    :for (beg end)
      :on (re:all-matches (tokenizer-regex
                           tokenizer)
                          string)
    :by #'cddr
    :collect (sub string beg end) :into words
    :collect (cons beg end) :into spans
    :finally (return (values words
                             spans))))
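The loop on this slide walks a flat list of match positions two at a time. A minimal standalone sketch of the same pattern, using only standard Common Lisp, may make it easier to try out. The function name `words-and-spans` and the hand-written position list are assumptions for illustration; in the slide the positions come from the regex scanner, and `sub` is replaced here by the standard `subseq`.

```lisp
(defun words-and-spans (string matches)
  "MATCHES is a flat list (BEG1 END1 BEG2 END2 ...) of positions in STRING.
Return two values: the list of substrings and the list of (BEG . END) spans."
  (loop :for (beg end) :on matches :by #'cddr
        :collect (subseq string beg end) :into words
        :collect (cons beg end) :into spans
        :finally (return (values words spans))))

;; Hypothetical usage with positions written out by hand.
(multiple-value-bind (words spans)
    (words-and-spans "Hello, world!" '(0 5 5 6 7 12 12 13))
  (format t "~S~%~S~%" words spans))
```

Returning the spans as a second value lets callers that only need the words ignore them, while callers that need to map tokens back to the original text can still get the offsets.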
20. Another Example
(defgeneric parse (model sentence)
  (:documentation
   "Parse SENTENCE with MODEL.")
  (:method :around (model (sentence string))
    (call-next-method
     model (tokenize <word-tokenizer> string))))

(defgeneric parse-n (model sentence n)
  (:documentation
   "Return N best parse trees of the SENTENCE
    with MODEL.")
  (:method :around (model (sentence string) n)
    (call-next-method
     model (tokenize <word-tokenizer> string) n)))
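The generic functions above accept either a raw string or an already tokenized sentence, tokenizing the string before the real work starts. A self-contained sketch of the same idea follows, with one deliberate difference: instead of `call-next-method` with changed arguments, it calls the generic function again on the token list. This variant was chosen because the standard expects changed arguments to `call-next-method` to select the same applicable methods as the original ones, so whether the slide's form is portable depends on how the primary methods are specialized. The names `split-on-spaces` and `count-tokens` are hypothetical stand-ins for the tokenizer and the parser.

```lisp
(defun split-on-spaces (string)
  "Split STRING on single spaces, dropping empty pieces."
  (loop :with start = 0
        :for end = (position #\Space string :start start)
        :for piece = (subseq string start end)
        :unless (string= piece "") :collect piece
        :while end
        :do (setf start (1+ end))))

(defgeneric count-tokens (model sentence)
  (:documentation
   "Return the number of tokens in SENTENCE. MODEL is unused in this sketch.")
  ;; Strings are tokenized first, then dispatch restarts on the token list.
  (:method (model (sentence string))
    (count-tokens model (split-on-spaces sentence)))
  ;; The real work happens on a list of tokens.
  (:method (model (sentence list))
    (declare (ignorable model))
    (length sentence)))
```

With these definitions, both `(count-tokens nil "the cat sat")` and `(count-tokens nil '("the" "cat" "sat"))` go through the list method.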