Using the Python library NetworkX to calculate statistics on a Twitter network, then displaying the results in several D3.js visualizations. Includes links to demos and source files. I'm @arnicas and live at www.ghostweather.com.
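As a rough illustration of that workflow, the sketch below builds a tiny directed "follows" graph, computes two statistics with NetworkX, and dumps a nodes/links JSON structure of the kind D3.js layouts commonly consume. The edge list, the account names, and the JSON field names are all assumptions made for this example; they are not taken from the presentation or from real Twitter data.

```python
import json
import networkx as nx

# Hypothetical follower edges (follower, followed); not real Twitter data.
edges = [("ann", "bob"), ("cat", "bob"), ("dan", "bob"), ("bob", "ann")]

G = nx.DiGraph()
G.add_edges_from(edges)

# Per-node statistics: number of followers and betweenness centrality.
in_degree = dict(G.in_degree())
betweenness = nx.betweenness_centrality(G)

# Assumed nodes/links layout for a D3.js visualization to load.
graph_json = {
    "nodes": [
        {"id": n, "in_degree": in_degree[n], "betweenness": betweenness[n]}
        for n in G.nodes()
    ],
    "links": [{"source": u, "target": v} for u, v in G.edges()],
}
payload = json.dumps(graph_json, indent=2)
print(payload)
```

Attaching the statistics to each node keeps the JavaScript side simple: a visualization can size or color nodes straight from the loaded fields.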
Data mining involves multiple steps in the knowledge discovery process including data cleaning, integration, selection, transformation, mining, and pattern evaluation. It has various functionalities including descriptive mining to characterize data, predictive mining for inference, and different mining techniques like classification, association analysis, clustering, and outlier analysis.
The document discusses key concepts of relational databases and relational algebra. It defines what a relation is as a set of tuples with attributes, and covers attribute types, keys, relations schemas and instances. It also summarizes the core relational algebra operations of selection, projection, join, union, difference and Cartesian product and how they are used to manipulate and query relations.
Files and directories can be manipulated in Python using various functions. Files are opened, read from and written to using methods like open(), read(), write() and close(). Directories can be created, listed, changed and deleted using os module functions like mkdir(), listdir(), chdir() and rmdir(). File operations involve opening, performing read/write and closing the file.
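A minimal sketch of the open/write/read/close cycle and the os directory functions named above. It works in a temporary directory so it leaves nothing behind; the directory and file names are hypothetical, and chdir() is left out to avoid changing the caller's working directory.

```python
import os
import tempfile

def roundtrip_demo():
    """Create a directory, write and read a file in it, then clean up."""
    base = tempfile.mkdtemp()                # scratch location for the demo
    work = os.path.join(base, "demo_dir")    # hypothetical directory name
    os.mkdir(work)

    path = os.path.join(work, "notes.txt")   # hypothetical file name
    f = open(path, "w")                      # open for writing
    f.write("hello\n")
    f.close()

    f = open(path, "r")                      # reopen for reading
    text = f.read()
    f.close()

    listing = os.listdir(work)               # names inside the directory

    os.remove(path)                          # rmdir() only removes empty directories
    os.rmdir(work)
    os.rmdir(base)
    return text, listing

text, listing = roundtrip_demo()
print(repr(text), listing)
```

In everyday code a `with open(...)` block is usually preferred because it closes the file even when an error occurs; the explicit close() calls here simply mirror the open, read/write, close sequence described above.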
The document discusses operator overloading in C++. It explains that operator overloading allows assigning additional operations to operators relative to user-defined classes. It provides examples of overloading the + operator to perform string concatenation. The document outlines how to define operator functions as member functions or non-member friend functions, and lists some rules for operator overloading like not changing operator precedence or syntax.
Data mining primitives include task-relevant data, the kind of knowledge to be mined, background knowledge such as concept hierarchies, interestingness measures, and methods for presenting discovered patterns. A data mining query specifies these primitives to guide the knowledge discovery process. Background knowledge like concept hierarchies allow mining patterns at different levels of abstraction. Interestingness measures estimate pattern simplicity, certainty, utility, and novelty to filter uninteresting results. Discovered patterns can be presented through various visualizations including rules, tables, charts, and decision trees.
Frames allow dividing a browser window into sections that can each load separate HTML documents. The <frameset> tag replaces the <body> tag and defines how to divide the window into rows and columns using frames. Each frame loads a document using the <frame> tag. Inline frames using <iframe> can embed another document anywhere in a page.
The document defines and describes various graph concepts and data structures used to represent graphs. It defines a graph as a collection of nodes and edges, and distinguishes between directed and undirected graphs. It then describes common graph terminology like adjacent/incident nodes, subgraphs, paths, cycles, connected/strongly connected components, trees, and degrees. Finally, it discusses two common ways to represent graphs - the adjacency matrix and adjacency list representations, noting their storage requirements and ability to add/remove nodes.
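The two representations can be sketched in a few lines. This is an illustrative version only: the function names and the sample graph are assumptions, and the graph is undirected unless `directed=True` is passed.

```python
def to_adjacency_list(nodes, edges, directed=False):
    """Map each node to the list of nodes it has an edge to."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    return adj

def to_adjacency_matrix(nodes, edges, directed=False):
    """Build an n x n matrix with 1 where an edge exists, else 0."""
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            matrix[index[v]][index[u]] = 1
    return matrix

# Hypothetical sample graph.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("c", "d")]
adj_list = to_adjacency_list(nodes, edges)
adj_matrix = to_adjacency_matrix(nodes, edges)
print(adj_list)
print(adj_matrix)
```

The matrix always stores n × n cells, while the list stores one entry per node plus one per edge endpoint, which is why the list form tends to suit sparse graphs and makes adding a node cheap.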
Functional dependencies in Database Management System by Kevin Jadiya
These slides mainly describe functional dependencies in database management systems, how to find the closure of a set of functional dependencies, and finally how decomposition is done on database tables.
This document provides an overview of Java input/output (I/O) concepts including reading from and writing to the console, files, and streams. It discusses different I/O stream classes like PrintStream, InputStream, FileReader, FileWriter, BufferedReader, and how to read/write characters, bytes and objects in Java. The document also introduces new I/O features in Java 7 like try-with-resources for automatic resource management.
This document discusses issues to consider when designing an entity-relationship (ER) diagram. It covers:
1. Whether to represent objects as entity sets or relationship sets.
2. Whether relationships should be binary or n-ary. N-ary relationships can be represented as multiple binary relationships connected through a new entity set.
3. Where to place attributes of relationships, which may depend on the cardinality ratio of the relationship. For example, attributes of a one-to-many relationship can be placed on the "many" side.
HTML frames allow a webpage to be divided into multiple separate windows called frames. Frames are created using the <frameset> tag, which replaces the <body> tag. The <frameset> tag uses attributes like cols and rows to specify the number and size of vertical or horizontal frames. Individual frames are defined using the <frame> tag within the <frameset>, and can specify attributes like name and src. Links between frames are set using the target attribute to specify which frame the linked content should open in.
The document discusses various PHP functions for manipulating files including:
- readfile() which reads a file and writes it to the output buffer
- fopen() which opens files and gives more options than readfile()
- fread() which reads from an open file
- fclose() which closes an open file
- fgets() which reads a single line from a file
- feof() which checks if the end-of-file has been reached
It also discusses sanitizing user input before passing it to execution functions to prevent malicious commands from being run.
Edgar Codd at IBM introduced the relational database model in 1970 and later set out 13 rules (numbered 0 to 12) that define a relational system. A relational database management system (RDBMS) stores data in related tables. RDBMSs help make data easy to store, retrieve, and combine in useful ways. Common RDBMSs include Microsoft SQL Server, Oracle, MySQL, and PostgreSQL. Tables are related through primary and foreign keys, which help enforce referential integrity.
The document discusses different types of SQL join operations including inner, left, right, and outer joins. It provides examples of each join type using sample tables and explains how the results of each join are determined. Key points covered include how joins combine rows from two or more tables based on matching column values, and how different join types handle rows with no matches differently, such as including or excluding them from results.
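The difference between join types in how they treat unmatched rows can be mimicked in plain Python. The sketch below is not SQL and is not taken from the slides: tables are modelled as lists of dicts, and the table and column names are hypothetical.

```python
def inner_join(left, right, key):
    """Keep only row pairs whose key values match."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if l[key] == r[key]
    ]

def left_join(left, right, key):
    """Keep every left row; fill right-hand columns with None when unmatched."""
    right_columns = {c for r in right for c in r if c != key}
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            result.append({**l, **{c: None for c in right_columns}})
    return result

# Hypothetical sample tables.
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [{"cust_id": 1, "item": "book"}]

inner = inner_join(customers, orders, "cust_id")
left = left_join(customers, orders, "cust_id")
print(inner)
print(left)
```

Bob has no orders, so he is absent from the inner join but appears in the left join with `item` set to None, much as SQL would return NULL.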
This document discusses different types of functions in SQL including string, aggregate, date, and time functions. String functions perform operations on strings and return output strings. Examples of string functions include ASCII, CHAR_LENGTH, and CONCAT. Aggregate functions operate on multiple rows and return a single value, such as COUNT, SUM, AVG, MIN, and MAX. Date functions return date part values and perform date calculations. Time functions extract and format time values.
The document discusses the relational database model. It was introduced in 1970 and became popular due to its simplicity and mathematical foundation. The model represents data as relations (tables) with rows (tuples) and columns (attributes). Keys such as primary keys and foreign keys help define relationships between tables and enforce integrity constraints. The relational model provides a standardized way of structuring data through its use of relations, attributes, tuples and keys.
This document provides an overview of an introductory C# programming course. The course covers C# fundamentals like setting up a development environment, data types, conditionals, loops, object-oriented programming concepts, and data structures. It includes topics like installing Visual Studio, writing a "Hello World" program, built-in data types like string, integer, boolean, and more. The document also outlines sample code solutions for exercises on command line arguments, integer operations, leap year finder, and powers of two.
This document discusses visualizing data in R using various packages and techniques. It introduces ggplot2, a popular package for data visualization that implements Wilkinson's Grammar of Graphics. Ggplot2 can serve as a replacement for base graphics in R and contains defaults for displaying common scales online and in print. The document then covers basic visualizations like histograms, bar charts, box plots, and scatter plots that can be created in R, as well as more advanced visualizations. It also provides examples of code for creating simple time series charts, bar charts, and histograms in R.
Graphs are a data structure composed of nodes connected by edges. There are two main types: directed graphs where edges show a flow between nodes, and undirected graphs where edges simply show a relationship between nodes. Key terminology includes adjacent nodes, paths, cyclic vs acyclic paths, and representations like adjacency matrices and lists. Graphs can model many real-world applications such as social networks, computer networks, road maps, and more.
Python Pandas is a powerful library for data analysis and manipulation. It provides rich data structures and methods for loading, cleaning, transforming, and modeling data. Pandas allows users to easily work with labeled data and columns in tabular structures called Series and DataFrames. These structures enable fast and flexible operations like slicing, selecting subsets of data, and performing calculations. Descriptive statistics functions in Pandas allow analyzing and summarizing data in DataFrames.
The document discusses importing and exporting data in R. It describes how to import data from CSV, TXT, and Excel files using functions like read.table(), read.csv(), and read_excel(). It also describes how to export data to CSV, TXT, and Excel file formats using write functions. The document also demonstrates how to check the structure and dimensions of data, modify variable names, derive new variables, and recode categorical variables in R.
This document provides an overview of timestamp protocols in database management systems. It discusses how timestamps are generated and used to order transactions. The basic timestamp ordering protocol checks timestamps on read and write operations to ensure serializability. Strict timestamp ordering delays some transactions to ensure schedules are both serializable and strict. Multiversion timestamp ordering uses multiple versions of data items to allow reads to always succeed while maintaining serializability.
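The read and write checks of the basic timestamp ordering protocol can be sketched as follows. This is a simplified textbook-style model, not code from the document: the class and function names are hypothetical, a return value of False stands for "abort the transaction", and the strict and multiversion variants are not covered.

```python
class DataItem:
    """A data item that tracks the largest read and write timestamps seen."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0

def try_read(item, ts):
    """Allow a read unless a younger transaction has already written the item."""
    if ts < item.write_ts:
        return False                      # too late: abort the reader
    item.read_ts = max(item.read_ts, ts)
    return True

def try_write(item, ts):
    """Allow a write unless a younger transaction has already read or written."""
    if ts < item.read_ts or ts < item.write_ts:
        return False                      # too late: abort the writer
    item.write_ts = ts
    return True

x = DataItem()
results = [
    try_write(x, 5),   # accepted, write_ts becomes 5
    try_read(x, 3),    # rejected, an older reader arrives after a newer write
    try_read(x, 7),    # accepted, read_ts becomes 7
    try_write(x, 6),   # rejected, a transaction with timestamp 7 already read x
]
print(results)
```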
Covers CSV file operations in Pandas, such as reading, writing, and performing joins, along with other operations in Python using DataFrames and Series.
The document discusses various SQL concepts like views, triggers, functions, indexes, joins, and stored procedures. Views are virtual tables created by joining real tables, and can be updated, modified or dropped. Triggers automatically run code when data is inserted, updated or deleted from a table. Functions allow reusable code and improve clarity. Indexes allow faster data retrieval. Joins combine data from different tables. Stored procedures preserve data integrity.
Introduction to Data Science, Prerequisites (tidyverse), Import Data (readr), Data Tidying (tidyr):
pivot_longer(), pivot_wider(), separate(), unite(), Data Transformation (dplyr - Grammar of Manipulation): arrange(), filter(),
select(), mutate(), summarise()
Data Visualization (ggplot - Grammar of Graphics): Column Chart, Stacked Column Graph, Bar Graph, Line Graph, Dual Axis Chart, Area Chart, Pie Chart, Heat Map, Scatter Chart, Bubble Chart
The document discusses stacks and queues. It defines stacks as LIFO data structures and queues as FIFO data structures. It describes basic stack operations like push and pop and basic queue operations like enqueue and dequeue. It then discusses implementing stacks and queues using arrays and linked lists, outlining the key operations and memory requirements for each implementation.
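A compact sketch of both structures, using one backing store of each kind: an array-backed stack (a Python list is a dynamic array) and a queue built on a singly linked list. The class names are illustrative and are not taken from the document.

```python
class ArrayStack:
    """LIFO stack backed by a Python list (a dynamic array)."""
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


class _Node:
    """One cell of the linked list: a value and a pointer to the next cell."""
    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedQueue:
    """FIFO queue backed by a singly linked list with head and tail pointers."""
    def __init__(self):
        self._head = None
        self._tail = None

    def enqueue(self, item):
        node = _Node(item)
        if self._tail is None:
            self._head = node          # first element: head and tail coincide
        else:
            self._tail.next = node
        self._tail = node

    def dequeue(self):
        if self._head is None:
            raise IndexError("dequeue from empty queue")
        node = self._head
        self._head = node.next
        if self._head is None:
            self._tail = None          # queue became empty
        return node.value


stack = ArrayStack()
queue = LinkedQueue()
for x in (1, 2, 3):
    stack.push(x)
    queue.enqueue(x)

stack_order = [stack.pop() for _ in range(3)]
queue_order = [queue.dequeue() for _ in range(3)]
print(stack_order, queue_order)
```

The linked version spends one extra pointer per element, while the array version occasionally pays for resizing; both give constant-time push/pop and enqueue/dequeue in the usual case.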
A class is a code template for creating objects. Objects have member variables and have behaviour associated with them. In Python, a class is created with the keyword class.
An object is created using the constructor of the class. This object is then called an instance of the class.
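For instance, a small class with member variables, one method, and an instance created through its constructor might look like this. The class name, attributes, and values are made up for illustration.

```python
class Account:
    """Template for account objects: member variables plus behaviour."""

    def __init__(self, owner, balance=0):
        # The constructor sets the member variables of the new instance.
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # Behaviour associated with the object.
        self.balance += amount
        return self.balance


acct = Account("Ada", 10)   # calling the class runs the constructor
acct.deposit(5)
print(acct.owner, acct.balance)
```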
This document provides an overview of relational algebra operations. There are five basic relational algebra operators: select, project, union, intersection, and cartesian product. The select and project operators are unary operators that operate on a single relation. Select filters rows based on a predicate, while project extracts specified column values. Union and intersection are binary operators that combine two relations, union returning all rows and intersection returning matching rows. Cartesian product returns all possible combinations of rows from two relations. Relational algebra provides a way to formulate queries using these relational operators.
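The operators listed above map naturally onto Python sets of tuples. The sketch below is an illustration under that modelling assumption; the helper names, the header, and the sample relations are hypothetical.

```python
def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return {r for r in rows if predicate(r)}

def project(rows, header, columns):
    """Projection: keep only the named columns (duplicates collapse)."""
    idx = [header.index(c) for c in columns]
    return {tuple(r[i] for i in idx) for r in rows}

def product(r, s):
    """Cartesian product: every row of r paired with every row of s."""
    return {a + b for a in r for b in s}

# Two hypothetical relations over the same schema (id, dept).
header = ("id", "dept")
r = {(1, "ops"), (2, "dev")}
s = {(2, "dev"), (3, "dev")}

union = r | s                     # rows in either relation
intersection = r & s              # rows in both relations
devs = select(union, lambda row: row[1] == "dev")
depts = project(union, header, ["dept"])
pairs = product(r, s)
print(union, intersection, devs, depts, len(pairs))
```

Because relations are sets, projecting three rows onto `dept` leaves only two tuples: duplicates disappear automatically.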
A fast-paced introduction to Deep Learning concepts, such as activation functions, cost functions, backpropagation, and then a quick dive into CNNs. Basic knowledge of vectors, matrices, and elementary calculus (derivatives), are helpful in order to derive the maximum benefit from this session.
Next we'll see a simple neural network using Keras, followed by an introduction to TensorFlow and TensorBoard. (Bonus points if you know Zorn's Lemma, the Well-Ordering Theorem, and the Axiom of Choice.)
2009 - Node XL v.84+ - Social Media Network Visualization Tools For Excel 2007 by Marc Smith
Overview of the NodeXL project (Network Overview, Discovery and Exploration) that adds social network metrics and visualization features to Excel 2007. Contains updated images from version .84 of the NodeXL project.
NetworkX is an open-source Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. NetworkX can load, store and analyze networks, generate new networks, build network models, and draw networks. It is a tool for computational network modelling rather than for software development. The library is written entirely in Python, and its first public release was in April 2005.
This document provides information about using the JavaScript InfoVis Toolkit (JIT) for data visualization. It discusses feeding JSON tree structure data to JIT visualizations, using controllers to customize visualizations, and exploring different visualization types including treemaps, sunbursts, icicles, and more. It also provides instructions for implementing a basic visualization with JIT by creating the data, HTML, JavaScript, and CSS files needed.
The document provides an overview of deep learning and its applications to Android. It begins with introductions to concepts like linear regression, activation functions, cost functions, and gradient descent. It then discusses neural networks, including convolutional neural networks (CNNs) and their use in image processing. The document outlines several approaches to integrating deep learning models with Android applications, including generating models externally or using pre-trained models. Finally, it discusses future directions for deep learning on Android like TensorFlow Lite.
This document summarizes a project analyzing GitHub user connection data to identify influential users and communities. The project processed over 1TB of GitHub event data from the past 6 months involving over 2 million users and 16 million events to construct a user collaboration graph. Insights from the graph found on average each user collaborates with 6 others, with some users connected to over 1,700 others. Challenges included the unstructured data and optimizing Spark jobs to handle the large data volumes within memory constraints.
This document provides an overview and introduction to deep learning concepts including linear regression, activation functions, gradient descent, backpropagation, hyperparameters, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and TensorFlow. It discusses clustering examples to illustrate neural networks, explores different activation functions and cost functions, and provides code examples of TensorFlow operations, constants, placeholders, and saving graphs.
An introduction to Deep Learning concepts, with a simple yet complete neural network, CNNs, followed by rudimentary concepts of Keras and TensorFlow, and some simple code fragments.
This presentation was prepared by Oleksii Prohonnyi for the LvivJS 2015 conference (http://lvivjs.org.ua/).
See the talk in Russian at the following link: https://youtu.be/oi7JhB8eWnA
A fast-paced introduction to Deep Learning that starts with a simple yet complete neural network (no frameworks), followed by an overview of activation functions, cost functions, backpropagation, and then a quick dive into CNNs. Next we'll create a neural network using Keras, followed by an introduction to TensorFlow and TensorBoard. For best results, familiarity with basic vectors and matrices, inner (aka "dot") products of vectors, and rudimentary Python is definitely helpful.
This document provides an overview of social network analysis using R. It discusses graph construction, visualization, querying, and centrality measures. Various R packages that support social network analysis are also presented, including igraph for network analysis and visualization, and visNetwork for interactive visualization. Finally, further readings and online resources on social network analysis and related R packages are listed.
2009 Node XL Overview: Social Network Analysis in Excel 2007 by Marc Smith
A quick overview of the features of NodeXL, the network overview, discovery, and exploration add-in for Excel 2007. This tool allows for visualizing directed graphs and social networks within Excel. It provides several network metrics and manipulation tools. Networks can be imported from Twitter and personal email.
The document discusses file management systems and database management systems (DBMS). It describes the different types of file organization including sequential, indexed sequential, and direct access. It also discusses fundamental characteristics of file management systems like creation, updating, retrieval, and maintenance of files. Additionally, it covers topics like data models, DBMS languages, database users, advantages and disadvantages of DBMS, and challenges of data redundancy.
The document describes a student project titled "Error Detection in Big Data on Cloud". The project aims to develop a time-efficient approach for detecting and correcting errors in large sensor data stored on the cloud. If errors are found, the approach also involves error recovery and storing the corrected data in its original format. The proposed method uses algorithms like cyclic redundancy check, Hamming code, and secure hash algorithm to detect and locate errors in big data sets efficiently. Design documents like data flow diagrams, use case diagrams and class diagrams were created to plan the system architecture and implementation of the project.
The document discusses NHibernate, an open source object-relational mapping framework for .NET. It begins by describing some of the limitations of using ADO.NET datasets for data access and how NHibernate provides a more object-oriented approach. It then provides steps to get started with NHibernate, including configuring NHibernate, defining a domain model, mapping the domain model to database tables, and generating the necessary code.
Visualising data: Seeing is Believing - CS Forum 2012 by Richard Ingram
When patterns and connections are revealed between numbers, content and people that might otherwise be too abstract or scattered to be grasped, we’re able to make better sense of where we are, what it might mean and what needs to be done.
COCO's Memory Palace: A Strange Fantasia by Lynn Cherny
This document describes COCO's Memory Palace, a project that uses AI image recognition APIs like Google Vision and Amazon Rekognition to analyze images and generate poems. COCO is trained on the MS COCO dataset of 80 common objects. The project combines images, poetry, and AI outputs to strange fantastical effect, mixing dreamlike memories and imaginings. It explores ideas from classical arts of memory involving the association of images to locations for remembering information.
Things I Think Are Awesome (Eyeo 2016 Talk) by Lynn Cherny
This document lists various links and ideas related to poetry, algorithms, and art. It discusses using algorithms and bots to generate or remix poetry in novel ways. It also advocates for building tools to support poetry, sharing the work of others, and focusing on nuance over novelty when studying algorithmically generated poems. The document suggests keeping poetry alive by making things with and for people, and celebrating both human and algorithmic creative works.
This document provides an overview of visualizing networks and some key concepts for understanding network structure and relationships. It begins by defining what a network is as a data structure of entities and relationships. It then discusses common network measures like degree, betweenness, eigenvector centrality, and community detection algorithms. Examples are given of visualizing network data in node-link diagrams and matrices. The document emphasizes that network visualizations should be used alongside calculating network metrics to understand a network's structure.
Bestseller Analysis: Visualization Fiction (for PyData Boston 2013) by Lynn Cherny
A version of my OpenVisConf talk "Bones of a Bestseller" that gives more detail on topic analysis and adds Python code. Blog post and ipynb code here: http://blogger.ghostweather.com/2013/08/pydata-boston-2013-more-on-fiction.html
The Bones of a Bestseller: Visualizing Fiction by Lynn Cherny
This document summarizes Lynn Cherny's presentation on visualizing fiction using statistical analysis and machine learning techniques. Some key points:
- Cherny used text classification models to detect sex scenes in 50 Shades of Grey with around 88% accuracy by training on manually labeled text chunks.
- Topic modeling was applied to Dan Brown novels to analyze "exciting scenes" but bag-of-words classifiers only achieved around 60% accuracy.
- An interactive visualization was created to show topics associated with ordered chapters in The Da Vinci Code along with chapter summaries.
- The presentation featured examples of classifying texts, crowdsourcing annotations, model evaluation, and interactive visualizations of results.
Design For Online Community: Beyond the HypeLynn Cherny
This document discusses design considerations for online communities. It begins by introducing the speaker and their credentials working with early online communities. It then discusses the business reasons for interest in online communities in the Web 2.0 era. The main part of the document outlines a plan of action for designing successful online communities, including defining community goals, understanding different definitions of community, identifying who will build and moderate the community, applying design principles, and measuring community success.
2. Plan
The Problem: Hairballs.
NetworkX – one tool
Stats on networks (and getting them from NetworkX)
Visualizing networks – some options
D3 demos of several
Lots of Links for Learning More
Lynn Cherny,
3/18/2012
[email protected]
3. The Problem: Moritz Stefaner’s Dataset on Twitter “Infovis” Folks
See http://well-formed-data.net/archives/642/the-vizosphere
4. Intro to NetworkX
A Python library for network/graph analysis and teaching, housed and documented well at:
http://networkx.lanl.gov/index.html
5. Aside on My Overall Code Strategy
1. Read in edgelist to NetworkX (or read in JSON)
2. Convert to NetworkX graph object
3. Calculate stats & save values as node attributes in the graph
(Verify it’s done with various inspections of the objects)
4. Write out JSON of nodes, edges and their attributes to use elsewhere
5. Move to D3 to visualize.
6. Go back to 1 and restart to revise stats.
Reduce the problem: send fewer nodes to JSON, or filter visible nodes in UI/vis.
7. Example Code from NetworkX
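import networkx as nx
from operator import itemgetter

# Python 2 / NetworkX 1.x syntax, as presented in 2012.
def calculate_degree_centrality(graph):
    g = graph
    dc = nx.degree_centrality(g)
    # store each score on its node (NetworkX 2.x and later expect
    # nx.set_node_attributes(g, dc, 'degree_cent') instead)
    nx.set_node_attributes(g, 'degree_cent', dc)
    # print the ten highest-scoring nodes
    degcent_sorted = sorted(dc.items(), key=itemgetter(1), reverse=True)
    for key, value in degcent_sorted[0:10]:
        print "Highest degree Centrality:", key, value
    return graph, dc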
Highest degree Centrality: flowingdata 0.848447961047
Highest degree Centrality: datavis 0.837492391966
Highest degree Centrality: infosthetics 0.828971393792
Highest degree Centrality: infobeautiful 0.653682288497
Highest degree Centrality: blprnt 0.567255021302
Highest degree Centrality: ben_fry 0.536822884967
Highest degree Centrality: moritz_stefaner 0.529519172246
Highest degree Centrality: eagereyes 0.524041387705
Highest degree Centrality: mslima 0.503956177724
Highest degree Centrality: VizWorld 0.503956177724
There are similar functions for
other stats in my code outline.
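As a rough illustration of what the scores above mean: degree centrality is a node's degree divided by n - 1, the largest degree it could have in a simple graph. The sketch below recomputes it in plain Python, counting in-links plus out-links the way NetworkX does for a directed graph. The function name and the toy follower list are invented for illustration and are not taken from the data set.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree centrality from a list of (follower, followed) pairs."""
    degree = defaultdict(int)
    for source, target in set(edges):      # ignore duplicate edges
        degree[source] += 1                # out-link
        degree[target] += 1                # in-link
    n = len(degree)
    if n < 2:
        return {node: 0.0 for node in degree}
    return {node: d / (n - 1) for node, d in degree.items()}

# Hypothetical toy data: everyone follows "hub", and "hub" follows "a" back.
toy_edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
scores = degree_centrality(toy_edges)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 3))
```

Because a directed graph counts links in both directions, a score can exceed 1 (as "hub" does here).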
8. Betweenness
A measure of connectedness between (sub)components of the graph: a node scores high when many shortest paths between other nodes pass through it.
http://en.wikipedia.org/wiki/Centrality#Betweenness_centrality
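To make the idea concrete, here is a minimal plain-Python sketch of Brandes' algorithm for unweighted graphs, a standard way to compute betweenness. It is an illustration only, not the code used for the talk; the function name and the toy graph (two triangles joined by one link) are assumptions made for the example.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph).

    adj maps each node to the nodes it links to. Scores count ordered
    (source, target) pairs, so halve them for an undirected graph.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical toy graph: two triangles joined by the single link c-d.
toy = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
bc = betweenness(toy)
print(sorted(bc.items(), key=lambda kv: -kv[1])[:2])
```

The two bridge nodes, c and d, carry every path between the triangles and so get all of the score; the other four nodes get none.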
11. Use Multiple Stats…
Drew Conway’s recent post on central leaders in China:
See also the excellent article by Valdis Krebs in First Monday on terrorist networks, using other network metrics in conjunction.
13. Community Detection Algorithms
E.g., the Louvain method, implemented in a lib that works with NetworkX
import community  # the python-louvain package
import networkx as nx

def find_partition(graph):
    # from http://perso.crans.org/aynaud/communities/
    g = graph
    # map each node to the number of its community
    partition = community.best_partition(g)
    print "Partitions found: ", len(set(partition.values()))
    print "Partition for node Arnicas: ", partition["arnicas"]
    nx.set_node_attributes(g, 'partition', partition)
    return g, partition
http://en.wikipedia.org/wiki/File:Network_Community_Structure.png
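The Louvain method groups nodes so as to maximise modularity: the share of links that fall inside communities, minus the share expected if links were placed at random while keeping each node's degree. As a hedged sketch of that quantity (not of the Louvain search itself), the plain-Python function below scores a given partition with Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of links inside community c, d_c the total degree of its nodes, and m the number of links. The function name and toy graph are invented for the example.

```python
def modularity(edges, partition):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected link listed once.
    partition: dict mapping node -> community label.
    """
    m = len(edges)
    inside = {}      # links with both ends in the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        for node in (u, v):
            c = partition[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if partition[u] == partition[v]:
            inside[partition[u]] = inside.get(partition[u], 0) + 1
    return sum(
        inside.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical toy graph: two triangles joined by the link c-d.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("d", "e"), ("d", "f"), ("e", "f"), ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
print(round(q_two, 3), round(q_one, 3))
```

Splitting the toy graph into its two triangles scores higher than lumping everything together, which is the kind of improvement a modularity-based method searches for.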
14. Dump Partition Number by Node
def write_node_attributes(graph, attributes):
    # utility function to let you print the node + various attributes in a csv format
    if not isinstance(attributes, list):
        attributes = [attributes]
    # look up each attribute's {node: value} dict once, not once per node
    attr_maps = [nx.get_node_attributes(graph, x) for x in attributes]
    for node in graph.nodes():
        vals = [str(attr_map[node]) for attr_map in attr_maps]
        print node, ",", ",".join(vals)
16. Aside: NetworkX I/O utility functions
Input -- List of edge pairs in txt file (e.g., “a b”)
networkx.read_edgelist converts a file of node pairs to a graph:
import json
import networkx as nx
from networkx.readwrite import json_graph

def read_in_edges(filename):
    g_orig = nx.read_edgelist(filename, create_using=nx.DiGraph())
    print "Read in edgelist file ", filename
    print nx.info(g_orig)
    return g_orig
Input or Output -- JSON
networkx.readwrite.json_graph.node_link_data
def save_to_jsonfile(filename, graph):
    g = graph
    g_json = json_graph.node_link_data(g)
    # close the file once the JSON is written
    with open(filename, 'w') as f:
        json.dump(g_json, f)
networkx.readwrite.json_graph.load
def read_json_file(filename):
    with open(filename) as f:
        graph = json_graph.load(f)
    print "Read in file ", filename
    print nx.info(graph)
    return graph
17. Saving a Subset…
For most of my visualization demos, I used a subset of
the full dataset. I sorted the 1644 nodes by eigenvector
centrality score and then saved only the top 100.
Code from my networkx_functs.py file:
eigen_sorted = sorted(eigen.items(), key=itemgetter(1), reverse=True)
for key, val in eigen_sorted[0:5]:
    print "Highest eigenvector centrality nodes:", key, val
# for trimming the dataset, you want it reverse sorted, with low values on top.
eigen_sorted = sorted(eigen.items(), key=itemgetter(1), reverse=False)
small_graph = trim_nodes_by_attribute_for_remaining_number(undir_g, eigen_sorted, 100)
print nx.info(small_graph)
# save as json for use in javascript - small graph, and full graph if you want
save_to_jsonfile(path+outputjsonfile, small_graph)
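The trimming helper called above is not shown on the slide. As a guess at the general idea only, the plain-Python sketch below keeps the n highest-scoring nodes of a graph stored as an adjacency dict and drops any links to removed nodes. The name `keep_top_n`, its signature and the toy data are hypothetical and do not come from networkx_functs.py.

```python
def keep_top_n(adjacency, scores, n):
    """Return a copy of the graph restricted to the n highest-scoring nodes.

    adjacency: dict mapping node -> set of neighbours.
    scores: dict mapping node -> centrality score.
    """
    ranked = sorted(adjacency, key=lambda node: scores[node], reverse=True)
    keep = set(ranked[:n])
    # Drop low-scoring nodes and any links that pointed at them.
    return {node: adjacency[node] & keep for node in keep}

# Hypothetical toy graph and scores.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
toy_scores = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
small = keep_top_n(toy_graph, toy_scores, 2)
print(small)
```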
18. Dump JSON of a graph (after my NetworkX calcs)
Works with all D3 examples I’ll show…
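For readers without the slide image, the sketch below shows the general shape of such a file: a list of nodes carrying their attributes, and a list of links whose source and target are positions in that node list. It is built by hand in plain Python; the helper name, attribute values and node names are made up, and the real output of node_link_data may carry extra keys.

```python
import json

def to_node_link(node_attrs, edges):
    """Build a D3-style dict: a node list plus links that refer to node positions."""
    names = sorted(node_attrs)
    index = {name: i for i, name in enumerate(names)}
    nodes = [dict(node_attrs[name], id=name) for name in names]
    links = [{"source": index[s], "target": index[t]} for s, t in edges]
    return {"nodes": nodes, "links": links}

# Hypothetical values; the real file carries the stats computed earlier.
toy_attrs = {
    "alice": {"degree_cent": 0.5, "partition": 0},
    "bob": {"degree_cent": 1.0, "partition": 0},
    "carol": {"degree_cent": 0.5, "partition": 1},
}
toy_edges = [("alice", "bob"), ("carol", "bob")]
data = to_node_link(toy_attrs, toy_edges)
print(json.dumps(data, indent=2))
```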
19. Gotchas to be aware of here
If you don’t use the “DiGraph” (directed graph) class in
NetworkX, you will lose some links. This changes some
visuals.
Your json links are based on index of the node. If/when
you do filtering in JSON based on, say, UI controls, you
need to redo your indexing on your links!
[e.g., See my code in demo full_fonts.html]
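The re-indexing step can be sketched as follows. This is a plain-Python illustration of the bookkeeping, not the JavaScript from the demo: it assumes node-link data whose links store positions in the node list, and the function name and toy data are invented.

```python
def filter_node_link(data, keep):
    """Keep only nodes accepted by keep(node) and re-index the links.

    data: dict with "nodes" (list of dicts) and "links" whose "source"
    and "target" are positions in the node list.
    """
    old_to_new = {}
    nodes = []
    for old_index, node in enumerate(data["nodes"]):
        if keep(node):
            old_to_new[old_index] = len(nodes)
            nodes.append(node)
    links = [
        dict(link, source=old_to_new[link["source"]], target=old_to_new[link["target"]])
        for link in data["links"]
        if link["source"] in old_to_new and link["target"] in old_to_new
    ]
    return {"nodes": nodes, "links": links}

# Hypothetical toy data: node "b" falls below the threshold and is dropped.
toy = {
    "nodes": [{"id": "a", "score": 0.9}, {"id": "b", "score": 0.1}, {"id": "c", "score": 0.7}],
    "links": [{"source": 0, "target": 1}, {"source": 2, "target": 0}],
}
subset = filter_node_link(toy, lambda node: node["score"] > 0.5)
print(subset["links"])
```

After filtering, node "c" moves from position 2 to position 1, so the surviving link has to be rewritten to point at the new positions; links to dropped nodes disappear.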
20. Visualizing Networks
NetworkX isn’t really for vis – can use Graphviz and other layouts for static pics.
Use Gephi to explore and calculate stats, too.
See my blog post and slideshare with UI screencaps of Gephi, using this same data set!
Apart from the hairball, there are other methods to visualize
graphs:
− See Robert Kosara’s post: http://eagereyes.org/techniques/graphs-hairball
− Lane Harrison’s post: http://blog.visual.ly/network-visualizations/
− MS Lima’s book Visual Complexity
Like many big data problems, use multiple stats and
multiple methods to explore!
21. D3.js by Mike Bostock
D3 allows creation of interactive visualizations…
Adjacency Matrix Chord Diagram Networks
22. Aside on Data Set Size
Adjacency matrix only holds a small number
of nodes at a time – I used 88 of the top 100
selected by eigenvector centrality for this
demo.
Chord diagrams are simplified reps of a
dataset – comparing the relations between
the top 100 by eigenvector centrality vs. the
whole 1644 nodes set reveals a most
interesting insight!
Interactive network vis is limited by browser
performance – and complexity of hairballs. If
you want it to be interactive (live) and not a
static image, you probably need to reduce
your data before or after loading.
24. What did this show?
Be sure to change the sort order on the right side:
The largest partition represented is the orange partition,
when you sort by partition (= subcommunity)
Some partitions (colors) have very few representatives in
the matrix of the top 88. We can suppose these
partitions are not composed of people with the highest
eigenvector centrality scores.
Node VizWorld is near or at the top in all the sort-by-
attribute methods offered, and is in the red partition, not
orange.
25. Chord Diagram: Summarize Relations
Top 100 nodes by eigenvector centrality, chords by target:
Not representative of full network… Interesting!
Demo
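A chord diagram of this kind summarizes links between groups rather than between individual nodes. One way to prepare its input, sketched here in plain Python under assumed names, is to count links from each partition to each other partition, giving the square matrix that D3's chord layout takes. The toy partition labels and links are invented.

```python
def partition_matrix(links, partition):
    """Count links between partitions: matrix[i][j] = links from group i to group j."""
    groups = sorted(set(partition.values()))
    position = {g: i for i, g in enumerate(groups)}
    matrix = [[0] * len(groups) for _ in groups]
    for source, target in links:
        matrix[position[partition[source]]][position[partition[target]]] += 1
    return groups, matrix

# Hypothetical toy data: two "green" nodes mostly follow "orange" ones.
toy_partition = {"a": "orange", "b": "orange", "c": "green", "d": "green"}
toy_links = [("a", "b"), ("c", "a"), ("d", "a"), ("d", "c")]
groups, matrix = partition_matrix(toy_links, toy_partition)
print(groups)
print(matrix)
```

Reading rows as sources and columns as targets, swapping the two corresponds to transposing the matrix.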
27. Insights from Comparing Them
The top 100 nodes by eigenvector
centrality are mostly the orange
partition. The green partition,
however, is the largest group in the
unfiltered set (the whole 1644
nodes).
− Notice how few green and purple partition
members “make” the top eigencentric list:
You can see supporting evidence of the orange partition’s high eigenvector centrality by looking at how many people link to its members from other partitions. Change the target/source radio button on the demo to see this in action.
28. Handling with Graph Rendering…
Typical Nodes/Edges, with sizing/coloring – slow, still a
hairball, not much visible useful info.
Avoid this simplistic vis method if you can…
Note this takes a little while to calm down!
Demo redballs
Alternate, slightly better: Names, color by Partition, UI
options and edges on click.
Demo force_fonts
29. Viewing the top scoring subset only…
Even with a small subset and partition clustering, showing all the links is a visual mess…
So only show them on demand.
30. Design Tweaks Made To Make It (More) Useful
Add a click-action to
− Fade out nodes unrelated to clicked node
− Show lines indicating who they follow
− Show the names (unlinked) of who follows them
Add a tooltip showing on-screen degree (i.e., following
and followed-by numbers for the subset)
Heavily adjusted layout to separate clusters visually (lots
of trial and error, see following slides)
Add stats on the sidebar showing some numbers, to
allow you to compare, for instance, onscreen degree vs.
degree in the whole set of 1644 nodes:
31. Creating the subset in JS instead of NetworkX
To create the union of the top N by each attribute, I took a shortcut and used underscore.js’s union function:
Then you need to update your links by filtering for the links referring to the subset of nodes, and fix the indices!
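The slide's JavaScript is not reproduced in this transcript. As a stand-in, here is a plain-Python sketch of the same idea: rank the nodes by each attribute, take the top N of each ranking, and keep the union. The function name, attribute names and toy values are assumptions made for the example.

```python
def top_n_union(nodes, attributes, n):
    """Names of nodes that rank in the top n for at least one attribute."""
    chosen = set()
    for attr in attributes:
        ranked = sorted(nodes, key=lambda node: node[attr], reverse=True)
        chosen.update(node["id"] for node in ranked[:n])
    return chosen

# Hypothetical toy nodes with two computed stats each.
toy_nodes = [
    {"id": "a", "degree": 9, "betweenness": 0.1},
    {"id": "b", "degree": 7, "betweenness": 0.2},
    {"id": "c", "degree": 2, "betweenness": 0.8},
    {"id": "d", "degree": 1, "betweenness": 0.0},
]
subset_ids = top_n_union(toy_nodes, ["degree", "betweenness"], 2)
print(sorted(subset_ids))
```

The union can hold more than N nodes, since a node that is only mid-ranked by degree may still top the betweenness list, as "c" does here.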
32. Insights from Subset
The most “readable” view is with fonts sized by “Betweenness” because of the large discrepancies:
Note that these look familiar from the Adjacency Matrix view!
33. How to “read” the layout
Nodes that drift towards the middle are linked to more
partition colors and nodes in the visible subset. Tooltips
show the following/follower relations for the subset only.
Nodes towards the fringes are less linked in general
inside this subset.
Itoworld, on the edge:
34. Interesting Oddities
wattenberg is in the orange partition, but within the top N nodes, follows* mostly green:
HansRosling follows no one but is followed by quite a few of the visible subset:
Ditto nytgraphics:
In general, the top N of the green partition follow each other. They’re the artists!
* This data set is from mid-2011!
36. Wrap up with an aside from a Frank van Ham Talk
http://bit.ly/s6udpy
37. Reminder(s)
The map is not the territory.
Just cuz social media software tools allow links between
people doesn’t mean they reflect the true – or complete
– social network of relationships.
(Also, this data set is no doubt out of date with respect to
current follower relations!)
39. General Network Primer Material
MIT OpenCourseware on Networks, Complexity,
Applications (many references!)
Frank van Ham’s slides from a recent datavis meetup
CRAN R code/links for handling graphs/networks
Chapter 10 of Rajaraman & Ullman’s book Mining of Massive Datasets
Graph Theory with Applications by Bondy and Murty
Intro to Social Network Methods by Hanneman and
Riddle
Networks, Crowds, and Markets by Easley and Kleinberg
My lists of sna / networks papers on delicious
40. NetworkX Info/Tutorials
NetworkX site docs/tutorial:
http://networkx.lanl.gov/tutorial/index.html
UC Dublin web science summer school data sets, slides, references: http://mlg.ucd.ie/summer
Stanford basic intro tutorial:
http://www.stanford.edu/class/cs224w/nx_tutorial/nx_tutorial.pdf
41. D3 Example Links (for networks)
D3.js – Mike Bostock
Super Useful force attributes explanation from Jim
Vallandingham
D3 Demo Talk Slides with embedded code by MBostock
Chicago Lobbyists by Manning
Mobile Patent Suits by Mbostock
Rollover selective highlight code by Manning
D3 Adjacency Matrix by Mbostock
Chord diagram: http://bost.ocks.org/mike/uberdata/
My giant, growing list of D3 links on delicious
42. Community Detection (a couple)
Overlapping Community Detection in Networks: State of
the Art and Comparative Study by Jierui Xie, Stephen
Kelley, Boleslaw K. Szymanski
Empirical Comparison of Algorithms for Network
Community Detection by Leskovec, Lang, Mahoney
43. Sources of “Canned” Network Data
Koblenz Network Collection
CMU’s CASOS
INSNA.org’s member datasets
44. Blog Post and Links
Zip file of slides, networkx code, and edgelist:
− http://www.ghostweather.com/essays/talks/networkx/source.zip
Blog post with links and more text:
− http://blogger.ghostweather.com/2012/03/digging-into-networkx-and-d3.html