This document provides an overview of the Python programming language. It discusses that Python is an easy-to-use, open-source scripting language.
The process of data warehousing is undergoing rapid transformation, giving rise to various new terminologies, especially due to the shift from the traditional ETL to the new ELT. For someone new to the process, these additional terminologies and abbreviations might seem overwhelming, and some may even ask, “Why does it matter if the L comes before the T?”
The answer lies in the infrastructure and the setup. Here is what the fuss is all about: the sequencing of the words and, more importantly, why you should be shifting from ETL to ELT.
The document discusses the pros and cons of staging data warehouses using Extract, Transform, Load (ETL) versus Extract, Load, Transform (ELT). Some key advantages of ETL include reducing development time by only extracting relevant data, simplifying administration through targeted data loads, and flexibility in tool selection. However, ETL can limit flexibility for future requirements and necessitate additional hardware. ELT enables project management in smaller chunks, flexibility for future needs, and risk minimization through isolated processes, but availability of mature tools is still limited as the technology is emerging.
ETL (Extract, Transform, Load) is a process that allows companies to consolidate data from multiple sources into a single target data store, such as a data warehouse. It involves extracting data from heterogeneous sources, transforming it to fit operational needs, and loading it into the target data store. ETL tools automate this process, allowing companies to access and analyze consolidated data for critical business decisions. Popular ETL tools include IBM Infosphere Datastage, Informatica, and Oracle Warehouse Builder.
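To make the three ETL steps concrete, here is a minimal, illustrative sketch in Python using only the standard library; the source file, table, and column names are hypothetical, and SQLite merely stands in for the target data store.

```python
import csv
import sqlite3

# Extract: read raw rows from a (hypothetical) source CSV export.
with open("orders_export.csv", newline="") as f:
    raw_rows = list(csv.DictReader(f))

# Transform: clean types and normalize values before loading.
transformed = [
    {
        "order_id": int(r["order_id"]),
        "amount": round(float(r["amount"]), 2),
        "country": r["country"].strip().upper(),
    }
    for r in raw_rows
]

# Load: write the cleaned rows into the target store (SQLite stands in
# for a data warehouse here).
conn = sqlite3.connect("warehouse.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (:order_id, :amount, :country)", transformed
)
conn.commit()
conn.close()
```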
This document provides an introduction to NoSQL and MongoDB. It discusses that NoSQL is a non-relational database management system that avoids joins and is easy to scale. It then summarizes the different flavors of NoSQL, including key-value stores, graphs, BigTable, and document stores. The remainder of the document focuses on MongoDB, describing its structure, how to perform inserts and searches, and features like map-reduce and replication. It concludes by encouraging the reader to try MongoDB themselves.
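As a rough illustration of the insert and search operations mentioned above, the following sketch uses the pymongo driver; the connection string, database, and field names are assumptions rather than anything from the original slides.

```python
from pymongo import MongoClient

# Connect to a locally running MongoDB instance (assumed address).
client = MongoClient("mongodb://localhost:27017")
db = client["demo"]

# Insert a document; MongoDB creates the collection on first write.
db.users.insert_one({"name": "Ada", "languages": ["python", "sql"], "age": 36})

# Search: a query that matches a value inside an array field.
for doc in db.users.find({"languages": "python"}):
    print(doc["name"], doc["age"])
```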
In this report, I have reviewed the top five cloud service providers based on their revenue, popularity, and service offerings. As of May 2020, Amazon Web Services leads in its ability to execute, while Microsoft’s Azure leads as a visionary. Google Cloud Platform is third in the race, followed by Alibaba Cloud and IBM Cloud.
Impala is an open source SQL query engine for Apache Hadoop that allows real-time queries on large datasets stored in HDFS and other data stores. It uses a distributed architecture where an Impala daemon runs on each node and coordinates query planning and execution across nodes. Impala allows SQL queries to be run directly against files stored in HDFS and other formats like Avro and Parquet. It aims to provide high performance for both analytical and transactional workloads through its C++ implementation and avoidance of MapReduce.
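For instance, a query can be submitted to an Impala daemon from Python via the impyla package; this is only a sketch, and the host name and table are made up (21050 is the usual HiveServer2-protocol port for Impala).

```python
from impala.dbapi import connect

# Connect to any node running an Impala daemon (HiveServer2 protocol).
conn = connect(host="impala-node.example.com", port=21050)
cur = conn.cursor()

# Impala plans and executes this directly against files in HDFS
# (e.g. Parquet), without launching MapReduce jobs.
cur.execute("SELECT country, COUNT(*) AS n FROM web_logs GROUP BY country")
for country, n in cur.fetchall():
    print(country, n)

cur.close()
conn.close()
```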
This document discusses optimizing Power BI and Azure Analysis Services (AAS) data models. It outlines the required tools for reviewing and optimizing models, including DAX Studio, VertiPaq Analyzer, Tabular Editor, and Best Practice Analyzer. The document provides steps for an initial review of table sizes, column sizes and cardinality, and DAX expressions. It also lists best practices for data model design, such as using a star schema, including only needed data, and simplifying reporting and naming. Resources for learning more about optimization and troubleshooting performance are also included.
This document provides an overview of Hadoop and its ecosystem. It describes Hadoop as a framework for distributed storage and processing of large datasets across clusters of commodity hardware. The key components of Hadoop are the Hadoop Distributed File System (HDFS) for storage, and MapReduce as a programming model for distributed computation across large datasets. A variety of related projects form the Hadoop ecosystem, providing capabilities like data integration, analytics, workflow scheduling and more.
There are two main types of relational database management systems (RDBMS): row-based and columnar. Row-based systems store all of a row's data contiguously on disk, while columnar systems store each column's data together across all rows. Columnar databases are generally better for read-heavy workloads like data warehousing that involve aggregating or retrieving subsets of columns, whereas row-based databases are better for transactional systems that require updating or retrieving full rows frequently. The optimal choice depends on the specific access patterns and usage of the data.
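The difference is easy to see in a toy sketch: the same (made-up) table can be held row-wise or column-wise, and a warehouse-style aggregate only has to touch one array in the columnar layout.

```python
# Row-oriented layout: each record is stored together
# (good for reading or updating whole rows).
rows = [
    {"id": 1, "city": "Oslo", "sales": 120.0},
    {"id": 2, "city": "Lima", "sales": 340.5},
    {"id": 3, "city": "Pune", "sales": 275.25},
]

# Column-oriented layout: each column is stored together
# (good for scans and aggregates over a few columns).
columns = {
    "id": [1, 2, 3],
    "city": ["Oslo", "Lima", "Pune"],
    "sales": [120.0, 340.5, 275.25],
}

# A warehouse-style aggregate reads only the 'sales' column here...
total_columnar = sum(columns["sales"])
# ...but has to walk every full record in the row layout.
total_row = sum(r["sales"] for r in rows)
assert total_columnar == total_row
```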
What is a Data Warehouse and How Do I Test It? – RTTS
ETL Testing: A primer for Testers on Data Warehouses, ETL, Business Intelligence and how to test them.
Are you hearing and reading about Big Data, Enterprise Data Warehouses (EDW), the ETL Process and Business Intelligence (BI)? The software markets for EDW and BI are quickly approaching $22 billion, according to Gartner, and Big Data is growing at an exponential pace.
Are you being tasked to test these environments or would you like to learn about them and be prepared for when you are asked to test them?
RTTS, the Software Quality Experts, provided this groundbreaking webinar, based upon our many years of experience in providing software quality solutions for more than 400 companies.
You will learn the answer to the following questions:
• What is Big Data and what does it mean to me?
• What are the business reasons for building a Data Warehouse and for using Business Intelligence software?
• How do Data Warehouses, Business Intelligence tools and ETL work from a technical perspective?
• Who are the primary players in this software space?
• How do I test these environments?
• What tools should I use?
This slide deck is geared towards:
QA Testers
Data Architects
Business Analysts
ETL Developers
Operations Teams
Project Managers
...and anyone else who (a) is new to the EDW space, (b) wants to be educated on the business and technical sides, and (c) wants to understand how to test these environments.
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ... – Edureka!
The document discusses Extract, Transform, Load (ETL) and Talend as an ETL tool. It states that ETL provides a one-stop solution for issues like data being scattered across different locations and sources, stored in different formats, and growing in volume. It describes the three processes of ETL - extract, transform, and load. It then discusses Talend as an open-source ETL tool, how Talend Open Studio can easily manage the ETL process with drag-and-drop functionality, and its strong connectivity and smooth extraction and transformation capabilities.
Big data architectures and the data lake – James Serra
The document provides an overview of big data architectures and the data lake concept. It discusses why organizations are adopting data lakes to handle increasing data volumes and varieties. The key aspects covered include:
- Defining top-down and bottom-up approaches to data management
- Explaining what a data lake is and how Hadoop can function as the data lake
- Describing how a modern data warehouse combines features of a traditional data warehouse and data lake
- Discussing how federated querying allows data to be accessed across multiple sources
- Highlighting benefits of implementing big data solutions in the cloud
- Comparing shared-nothing, massively parallel processing (MPP) architectures to symmetric multi-processing (SMP) architectures
CERN is expanding its computing infrastructure to support growing data and computing needs. It is adopting open source tools like Puppet for configuration management and OpenStack for cloud computing. CERN plans to deploy OpenStack into production in 2013 to manage over 15,000 hypervisors and 100,000 VMs across its data centers by 2015, supporting both traditional and cloud-based workflows. This will enable CERN to more efficiently manage resources and better support dynamic workloads and temporary spikes in demand.
These slides present how DBT, Coral, and Iceberg can provide a novel data management experience for defining SQL workflows. In this UX, users define their workflows as a cascade of SQL queries, which then get auto-materialized and incrementally maintained. Applications of this user experience include Declarative DAG workflows, streaming/batch convergence, and materialized views.
Learn to Use Databricks for Data Science – Databricks
Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations continue to become more data-driven, a collaborative environment is more critical than ever — one that provides easier access and visibility into the data, reports and dashboards built against the data, reproducibility, and insights uncovered within the data. Join us to hear how Databricks’ open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale — all on one unified platform.
This document provides an introduction and overview of Apache Spark with Python (PySpark). It discusses key Spark concepts like RDDs, DataFrames, Spark SQL, Spark Streaming, GraphX, and MLlib. It includes code examples demonstrating how to work with data using PySpark for each of these concepts.
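A minimal PySpark sketch along the lines the deck describes, showing the DataFrame API and Spark SQL side by side; the column names and values are invented for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

# Build a small DataFrame in place of an external source.
df = spark.createDataFrame(
    [("alice", "books", 12.5), ("bob", "games", 30.0), ("alice", "games", 7.5)],
    ["user", "category", "amount"],
)

# DataFrame API: filter and aggregate.
per_user = (
    df.filter(F.col("amount") > 5)
      .groupBy("user")
      .agg(F.sum("amount").alias("total"))
)

# Spark SQL over the same data via a temporary view.
df.createOrReplaceTempView("purchases")
per_category = spark.sql(
    "SELECT category, SUM(amount) AS total FROM purchases GROUP BY category"
)

per_user.show()
per_category.show()
spark.stop()
```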
Building Reliable Data Lakes at Scale with Delta Lake – Databricks
Most data practitioners grapple with data reliability issues—it’s the bane of their existence. Data engineers, in particular, strive to design, deploy, and serve reliable data in a performant manner so that their organizations can make the most of their valuable corporate data assets.
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Built on open standards, Delta Lake employs co-designed compute and storage and is compatible with Spark APIs. It powers high data reliability and query performance to support big data use cases, from batch and streaming ingest and fast interactive queries to machine learning. In this tutorial we will discuss the requirements of modern data engineering, the challenges data engineers face when it comes to data reliability and performance, and how Delta Lake can help. Through presentation, code examples, and notebooks, we will explain these challenges and the use of Delta Lake to address them. You will walk away with an understanding of how you can apply this innovation to your data architecture and the benefits you can gain.
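As a rough sketch of the idea (assuming a Spark session with the delta-spark package configured; the path and schema are made up), writing and reading a Delta table looks like ordinary Spark I/O with the delta format:

```python
from pyspark.sql import SparkSession

# Assumes the delta-spark package and its SQL extensions are already configured.
spark = SparkSession.builder.appName("delta-demo").getOrCreate()

events = spark.createDataFrame(
    [(1, "click"), (2, "view"), (3, "click")], ["user_id", "event"]
)

# Each write is an ACID transaction recorded in the Delta transaction log.
events.write.format("delta").mode("append").save("/tmp/events_delta")

# Readers always see a consistent snapshot of the table.
snapshot = spark.read.format("delta").load("/tmp/events_delta")
snapshot.groupBy("event").count().show()
spark.stop()
```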
This tutorial will be both instructor-led and hands-on interactive session. Instructions on how to get tutorial materials will be covered in class.
What you’ll learn:
Understand the key data reliability challenges
How Delta Lake brings reliability to data lakes at scale
Understand how Delta Lake fits within an Apache Spark™ environment
How to use Delta Lake to realize data reliability improvements
Prerequisites
A fully-charged laptop (8-16GB memory) with Chrome or Firefox
Pre-register for Databricks Community Edition
NoSQL databases get a lot of press coverage, but there seems to be a lot of confusion surrounding them, as in which situations they work better than a Relational Database, and how to choose one over another. This talk will give an overview of the NoSQL landscape and a classification for the different architectural categories, clarifying the base concepts and the terminology, and will provide a comparison of the features, the strengths and the drawbacks of the most popular projects (CouchDB, MongoDB, Riak, Redis, Membase, Neo4j, Cassandra, HBase, Hypertable).
Lessons from building a stream-first metadata platform | Shirshanka Das, Stealth – Hosted by Confluent
Shirshanka Das presents on building LinkedIn DataHub, a metadata platform. He discusses how enterprises need to solve data discovery, governance and data ops challenges. LinkedIn DataHub addresses this by liberating metadata from various systems and ingesting it into a centralized platform using Kafka. The metadata is then stored, indexed and served to enable data discovery, social search, data lineage and other capabilities. He outlines the overall architecture and how the stream-first approach supports use cases like operational metadata and data quality monitoring.
How To Connect Spark To Your Own Datasource – MongoDB
1) Ross Lawley presented on connecting Spark to MongoDB. The MongoDB Spark connector started as an intern project in 2015 and was officially launched in 2016, written in Scala with Python and R support.
2) To read data from MongoDB, the connector partitions the collection, optionally using preferred shard locations for locality. It computes each partition's data as an iterator to be consumed by Spark.
3) For writing data, the connector groups data into batches by partition and inserts into MongoDB collections. DataFrames/Datasets will upsert if there is an ID.
4) The connector supports structured data in Spark by inferring schemas, creating relations, and allowing multi-language access from Scala, Python, and R.
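A sketch of how reading and writing through the connector typically looks from PySpark; the "mongodb" format string corresponds to recent (10.x) connector versions, and the URI, database, and collection names are assumptions.

```python
from pyspark.sql import SparkSession

# Assumes the MongoDB Spark connector package is on the Spark classpath.
spark = (
    SparkSession.builder.appName("mongo-spark")
    .config("spark.mongodb.read.connection.uri", "mongodb://localhost:27017")
    .config("spark.mongodb.write.connection.uri", "mongodb://localhost:27017")
    .getOrCreate()
)

# Read: the connector partitions the collection and infers a schema.
orders = (
    spark.read.format("mongodb")
    .option("database", "shop")
    .option("collection", "orders")
    .load()
)

# Write: rows are grouped into batches per partition and inserted.
(
    orders.filter(orders["amount"] > 100)
    .write.format("mongodb")
    .option("database", "shop")
    .option("collection", "big_orders")
    .mode("append")
    .save()
)
spark.stop()
```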
Machine Learning Data Lineage with MLflow and Delta Lake – Databricks
This document discusses machine learning data lineage using Delta Lake. It introduces Richard Zang and Denny Lee, then outlines the machine learning lifecycle and challenges of model management. It describes how MLflow Model Registry can track model versions, stages, and metadata. It also discusses how Delta Lake allows data to be processed continuously and incrementally in a data lake. Delta Lake uses a transaction log and file format to provide ACID transactions and allow optimistic concurrency control for conflicts.
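A small sketch of the Model Registry flow described above, using the public MLflow and scikit-learn APIs; the registered model name is a placeholder, and a tracking server with a registry backend is assumed.

```python
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run() as run:
    model = LogisticRegression(max_iter=200).fit(X, y)
    # Log the fitted model as an artifact of this run.
    mlflow.sklearn.log_model(model, artifact_path="model")

# Register that run's model as a new version under a registry name.
model_uri = f"runs:/{run.info.run_id}/model"
mv = mlflow.register_model(model_uri=model_uri, name="iris-classifier")
print(mv.name, mv.version)
```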
This document discusses efficient analysis of big data using the MapReduce framework. It introduces the challenges of analyzing large and complex datasets, and describes how MapReduce addresses these challenges through its map and reduce functions. MapReduce allows distributed processing of big data across clusters of computers using a simple programming model.
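To ground the map and reduce functions, here is the classic word-count pair written in a Hadoop Streaming style; run locally, the script simulates the shuffle-and-sort step that the framework would perform between the two phases.

```python
import sys
from itertools import groupby

def mapper(lines):
    # Map: emit (word, 1) for every word in the input.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    # Reduce: pairs arrive grouped by key; sum the counts per word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Local stand-in for the distributed shuffle-and-sort between phases.
    for word, total in reducer(mapper(sys.stdin)):
        print(f"{word}\t{total}")
```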
In Data Engineer’s Lunch #41: Pygrametl, we discussed pygrametl, a Python ETL tool, to close out our series on Python ETL tools.
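A compressed sketch modeled on pygrametl's documented dimension/fact-table usage (table and column names are invented, SQLite stands in for the warehouse, and details may differ between versions):

```python
import sqlite3

import pygrametl
from pygrametl.tables import Dimension, FactTable

# Target store: SQLite stands in for a real warehouse here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (productid INTEGER PRIMARY KEY, name TEXT, category TEXT)"
)
conn.execute("CREATE TABLE sales (productid INTEGER, amount REAL)")

# pygrametl routes all table objects through a shared connection wrapper.
wrapper = pygrametl.ConnectionWrapper(connection=conn)

product_dim = Dimension(name="product", key="productid",
                        attributes=["name", "category"])
sales_fact = FactTable(name="sales", keyrefs=["productid"],
                       measures=["amount"])

# One "row" of source data; ensure() looks the dimension member up
# and inserts it if it is new, returning the surrogate key.
row = {"name": "widget", "category": "tools", "amount": 9.5}
row["productid"] = product_dim.ensure(row)
sales_fact.insert(row)

wrapper.commit()
```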
Accompanying Blog: https://blog.anant.us/data-engineers-lunch-41-pygrametl
Accompanying YouTube: https://youtu.be/YiPuJyYLXxs
Sign Up For Our Newsletter: http://eepurl.com/grdMkn
Join Data Engineer’s Lunch Weekly at 12 PM EST Every Monday:
https://www.meetup.com/Data-Wranglers-DC/events/
Cassandra.Link:
https://cassandra.link/
Follow Us and Reach Us At:
Anant:
https://www.anant.us/
Awesome Cassandra:
https://github.com/Anant/awesome-cassandra
Email:
[email protected]
LinkedIn:
https://www.linkedin.com/company/anant/
Twitter:
https://twitter.com/anantcorp
Eventbrite:
https://www.eventbrite.com/o/anant-1072927283
Facebook:
https://www.facebook.com/AnantCorp/
Join The Anant Team:
https://www.careers.anant.us
The document describes the software architecture of Informatica PowerCenter ETL product. It consists of 3 main components: 1) Client tools that enable development and monitoring. 2) A centralized repository that stores all metadata. 3) The server that executes mappings and loads data into targets. The architecture diagram shows the data flow from sources to targets via the server.
This session covers how to work with the PySpark interface to develop Spark applications, from loading and ingesting data to applying transformations. The session covers how to work with different data sources, apply transformations, and follow Python best practices in developing Spark apps. The demo covers integrating Apache Spark apps, in-memory processing capabilities, working with notebooks, and integrating analytics tools into Spark applications.
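As a rough sketch of that workflow (all file paths and column names are placeholders), loading from several source formats and applying a transformation might look like this:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-sources").getOrCreate()

# Load from different source formats (paths are placeholders).
customers = spark.read.option("header", True).csv("/data/customers.csv")
orders = spark.read.json("/data/orders.json")
products = spark.read.parquet("/data/products.parquet")

# Transform: join the sources and derive an aggregate,
# planned lazily and executed in memory across the cluster.
revenue_per_customer = (
    orders.join(customers, on="customer_id")
          .join(products, on="product_id")
          .withColumn("line_total", F.col("quantity") * F.col("unit_price"))
          .groupBy("customer_id", "name")
          .agg(F.sum("line_total").alias("revenue"))
)

revenue_per_customer.write.mode("overwrite").parquet("/data/revenue_per_customer.parquet")
spark.stop()
```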
Both Python and PHP are interpreted scripting languages popular for web applications because they facilitate rapid prototyping. They are both open source and can run on almost all platforms without recompilation, making them extremely portable. PHP is more commonly used for web development because it can be embedded directly in HTML files and has more resources. However, Python requires understanding additional technologies like CGI and WSGI to use for web applications.
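For example, the WSGI interface mentioned above is small enough to show in full with the standard library's reference server; this is a minimal sketch, not production code.

```python
from wsgiref.simple_server import make_server

def app(environ, start_response):
    # A WSGI application receives the request environment,
    # starts the response, and returns an iterable of bytes.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    path = environ.get("PATH_INFO", "/")
    return [f"Hello from Python at {path}\n".encode("utf-8")]

if __name__ == "__main__":
    # Development-only server from the standard library.
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```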
The document compares several major programming platforms: C++, Java, C#, and PHP. It provides pros and cons for each platform, discussing performance, cross-platform capabilities, trends, and other factors. It focuses on differences between C++, Java, C# on .NET, C# on Mono, and web development platforms like PHP that are commonly used with LAMP stacks.
This document summarizes a company's transition from a PHP codebase to Python. It discusses why they made the change, how they approached it incrementally instead of a full rewrite, and what they learned from the process. Key points include adopting Django and SQLAlchemy, improving testing, and maintaining the existing PHP session handling. The transition faced challenges but increased developer productivity and the ability to add new features more quickly. Overall the experience reinforced the benefits of Python for their needs.
United Global Soft provides online and classroom training in .NET and C# concepts through experienced instructors. They offer comprehensive courses covering topics such as ASP.NET, ADO.NET, databases, AJAX, and more. Students receive training materials, exercises, projects, interview preparation assistance, and help finding jobs upon completion. The training is designed to provide practical skills for professional development.
Unix is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, developed in the 1970s at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and others.
Initially intended for use inside the Bell System, AT&T licensed Unix to outside parties from the late 1970s, leading to a variety of both academic and commercial variants of Unix from vendors such as the University of California, Berkeley (BSD), Microsoft (Xenix), IBM (AIX) and Sun Microsystems (Solaris). AT&T finally sold its rights in Unix to Novell in the early 1990s, which then sold its Unix business to the Santa Cruz Operation (SCO) in 1995,[4] but the UNIX trademark passed to the industry standards consortium The Open Group, which allows the use of the mark for certified operating systems compliant with the Single UNIX Specification. Among these is Apple's OS X, which is the Unix version with the largest installed base as of 2014.
The document provides an introduction to Unix and security policies. It discusses the history of Unix, common Unix commands and files, basic Unix concepts like permissions and processes. It also outlines steps for developing security policies, including getting management approval, writing clear policies, publishing them, allowing exceptions, and training users. The document recommends monitoring logs, restricting access and permissions, and references resources for further information.
This lecture overviews today's leading technologies in web application development and provides a detailed comparison between the three. This lecture is relevant both for software developers and software development managers who need to select which technology to use, for students taking their first steps in the practical world, and for people without any background in software development.
More information about the Java course I deliver can be found at java.course.lifemichael.com
More information about the PHP course I deliver can be found at php.course.lifemichael.com
More information about the C# course I deliver can be found at csharp.course.lifemichael.com
Python is an interpreted, open source programming language that is simple, powerful, and preinstalled on many systems. It has less syntax than other languages and a plethora of penetration testing tools have already been created in Python. Python code is translated and executed by an interpreter one statement at a time, allowing it to be run from the command prompt, through command prompt files, or in an integrated development environment. The language uses whitespace and comments to make code more readable. It can perform basic operations like printing, taking user input, performing conditionals and loops, defining reusable functions, and importing additional modules.
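A few lines covering the basics the document lists (printing, input, conditionals, loops, reusable functions, and imports); the script itself is only an illustration.

```python
import math  # importing an additional module

def classify(radius: float) -> str:
    """Reusable function: label a circle by its area."""
    area = math.pi * radius ** 2
    # Conditionals select between alternatives; indentation delimits blocks.
    if area > 100:
        return "large"
    elif area > 10:
        return "medium"
    return "small"

# Loop over a few values and print the results.
for r in (1, 2, 10):
    print(r, classify(r))

# Take input from the user and reuse the function.
value = float(input("Enter a radius: "))
print("Your circle is", classify(value))
```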
The document introduces the basic concepts of .NET, including its structure, components, and tools. Visual .NET is an object-oriented programming language that implements the .NET framework. The .NET framework provides the working environment and components such as the CLR, MSIL, ADO.NET, and ASP.NET for developing and running applications. The .NET platform also includes languages such as C# and VB.NET and class libraries, and is built on an open architecture.
The document discusses the Unix operating system. It describes Unix systems as using plain text storage, a hierarchical file system, and treating devices as files. It also discusses the Unix philosophy of using small, strung together programs instead of large monolithic programs. The document then summarizes Unix kernel subsystems like process management and memory management. It provides an overview of shell scripts, their advantages, and how to create and use variables within scripts.
The document compares and contrasts Java and .NET frameworks. It discusses how both use intermediate languages - Java uses bytecode that runs on the Java Virtual Machine (JVM) and .NET uses Microsoft Intermediate Language (MSIL) that runs on the Common Language Runtime (CLR). Both frameworks allow applications to run across platforms, however the JVM was designed for platform independence while .NET was initially only supported on Windows. The document also notes that the choice between Java and .NET often comes down to non-technical factors like developer skills and customer/vendor preferences.
Linux is an operating system similar to Unix. The document lists and describes 27 common Linux commands, including commands for listing files (ls), removing files and directories (rm, rmdir), viewing file contents (cat, more, less), navigating and creating directories (cd, mkdir), moving and copying files (mv, cp), searching files (grep), counting characters (wc), checking the current working directory (pwd), getting command help (man), finding files and programs (whereis, find, locate), editing files (vi, emacs), connecting remotely (telnet, ssh), checking network status (netstat, ifconfig), getting information about internet hosts (whois, nslookup, dig, finger), testing network connectivity
VB.NET provides tools for accessing and manipulating database content using ADO.NET objects like the Connection, Command, DataAdapter and DataSet. The DataAdapter fills a DataSet with data retrieved from a database using an OLE DB or ODBC connection. Bound controls can then display and edit this data. Unbound controls can navigate records by changing the current position in the DataSet using methods like Find, MoveNext and filtering with parameter queries.
Study of the similarities and differences between the Android and iOS system architectures from an operating system perspective, covering thread management, process management, memory management, etc., with more technical details.
This document provides commands for basic file management and system utilities in Linux/Unix systems. It includes commands for listing, moving, copying, deleting and changing permissions of files and directories. It also includes commands for editing files, finding files, archiving files, printing files, managing processes, debugging programs, I/O redirection and setting environment variables.
Open-source, general-purpose, multiplatform programming language
Object-oriented, procedural, and functional
Easy to interface with C/ObjC/Java/Fortran
Easy to interface with C++ (via SWIG)
Great interactive environment
Python's 'philosophy' emphasizes readability, clarity, and simplicity
The Interactive Interpreter
It is very easy to learn and understand.
Java vs. C#
The document compares Java and C# programming languages. It discusses some key differences:
1. Syntax, such as main method signatures, print statements, and array declarations, differs slightly between the two languages.
2. Some concepts are modified in C# from Java, such as polymorphism requiring the virtual keyword, operator overloading restrictions, and switch statements allowing string cases.
3. C# introduces new concepts not in Java like enumerations, foreach loops, properties to encapsulate fields, pointers in unsafe contexts, and passing arguments by reference.
Introduction to .NET Framework and C# (English) – Vangos Pterneas
A brief introduction to .NET Framework and C# for a presentation in Athens University of Economics and Business (in English). MSDN Academic Alliance and Imagine Cup are also discussed.
Presenters:
Vangos Pterneas (http://twitter.com/Pterneas)
Pavlos Touroulitis
Alex Tzanetopoulos (http://twitter.com/nerdtechnews)
Date: October 26, 2010
This document provides an introduction to the Python programming language. It discusses that Python is an interpreted, object-oriented language that was first released in 1990 and was designed by Guido van Rossum. It also highlights that Python is easy to learn, readable, simple, and multipurpose. Examples of Python code and comparisons to R are provided. Popular online resources for learning Python are listed. The document also discusses Python's uses in areas like application development, web development, scientific computing, and more. Pros and cons of Python are outlined.
Python Programming Unit1_Aditya College of Engg & Tech – Ramanamurthy Banda
Python was created in the late 1980s by Guido van Rossum at CWI in the Netherlands as a successor to the ABC language. It is an interpreted, object-oriented programming language that is easy to read and maintain. Python code is portable and can be used for web development, desktop GUIs, games, data science, and more due to its large standard library and extensive third party libraries. Some limitations are that performance is not as fast as lower-level compiled languages and it is not well-suited for mobile applications.
COMPUTER LANGUAGES AND THEIR DIFFERENCES – Pavan Kalyan
In this ppt you will understand the differences among languages and what is necessary for a language to become the best in the present software field.
Vision Academy is a well-known computer training institute in Hadapsar, Pune, established in 2005. The institute was started by its visionary director, Mr Sachin Zurange, who completed an MSc (Scientific Computing) from the Interdisciplinary School of Scientific Computing, University of Pune, and also cleared the SET exam in May 2018. It provides BCS, BCA, BBA (Comp. App), MCS, MCA, Dip (Comp), and BE (Comp/IT) coaching classes in Hadapsar, Pune. It mainly imparts training in programming languages such as C, C++, Java, Advanced Java, PHP, Python, .NET, HTML, JavaScript, jQuery, and Angular JS; database languages such as Oracle, Postgres, MySQL, and SQL Server; and key subjects like Data Structures, Operating Systems, and RDBMS. It also provides career-oriented programs in web design, WordPress, and digital marketing. More than 10,000 students have been trained at Vision Academy. It provides a 100% practical-oriented training program with 100% job placement.
Python Training in Pune - Ethans Tech Pune – Ethan's Tech
This document provides an overview of Module 1 of a Python training course. It discusses why Python is used, its history and origins from Monty Python, and the key features of Python like its scripting capabilities, portability, and use in various industries. The module objectives are to write a first Python program, use variables and keywords, and get experience with the interactive shell. It also covers installing Python, differences between Python 2 and 3, and taking the first steps in Python like running a simple print statement program.
A Research Study of Data Collection and Analysis of Semantics of Programming ... – IRJET Journal
This document summarizes a research study on data collection and analysis of programming language semantics. It discusses several key programming languages like C++, C, Pascal, Fortran, Java, Perl, PHP, and Scheme. It analyzes the features and usage of these languages. It also compares Python and R as good options for beginners in data science and discusses why Python may have a lower learning curve. Finally, it discusses the importance of incorporating semantic results into practical systems to help language designers and programmers better understand languages.
Vision Academy Pune is a leading institute in Pune conducting training programs for various software fields and providing certification for both individuals and organizations. The training institute is a subsidiary of Optimized Infotech, which offers IT services and training.
Vision Academy's Python Certification Training not only focuses on the fundamentals of Python, statistics, and machine learning but also helps one gain expertise in applied Data Science at scale using Python. The training is a step-by-step guide to Python and Data Science with extensive hands-on practice.
The Vision Python Institute offers you the best Python course in Pune, with 12 years of proficiency in e-commerce, Python, blockchain, Big Data, Data Science, and more, and over 5000+ Python students trained and placed, with the help of our highly qualified, experienced faculty members and their endless efforts to bring out the best in every student. We provide relevant information and data from various experts, with additional reference content and unconditional practical knowledge.
This document provides notes on the Python programming language. It begins with a brief history of Python, noting it was created by Guido van Rossum in 1991. It then discusses several key features of Python, including that it is easy to learn and use, interpreted, cross-platform, free and open source, supports object-oriented programming, GUI programming, dynamic memory allocation, and is embeddable in other languages. Examples of applications of Python are also provided, such as for web development, desktop GUIs, scientific computing, business applications, and more. The document concludes with discussions of Python identifiers, keywords, comments, indentation, and variables.
Python was created in the late 1980s by Guido van Rossum as a successor to the ABC programming language. It uses dynamic typing and garbage collection for memory management. Key features include its clear syntax, object orientation, modularity through packages, and extensive standard libraries. Python code is highly readable and portable across operating systems.
This document provides an introduction to the Python programming language. It discusses what Python is, why it was created, its basic features and uses. Python is an interpreted, object-oriented programming language that is designed to be readable. It can be used for tasks such as web development, scientific computing, and scripting. The document also covers Python basics like variables, data types, operators, and input/output functions. It provides examples of Python code and discusses best practices for writing and running Python programs.
Python is a powerful and object-oriented programming language that has grown rapidly in popularity due to its simplicity and flexibility. It supports multiple programming paradigms and has a large standard library. Python source code is first compiled to bytecode, which is then executed by the Python Virtual Machine. While Java may be faster for single algorithms, Python is easier for beginners to learn and its dynamic typing and automatic memory management make programs quicker to write. It has gained widespread use for web development, data science, and scripting.
Python for Science and Engineering: a presentation to A*STAR and the Singapor... – pythoncharmers
An introduction to Python in science and engineering.
The presentation was given by Dr Edward Schofield of Python Charmers (www.pythoncharmers.com) to A*STAR and the Singapore Computational Sciences Club in June 2011.
Best E-commerce Website, WordPress, Python Development Company in India, France, Canada, US. We provide online development services with 2X faster delivery. Hire remote developers in India.
This presentation is a part of the COP2271C college level course taught at the Florida Polytechnic University located in Lakeland Florida. The purpose of this course is to introduce Freshmen students to both the process of software development and to the Python language.
The course is one semester in length and meets for 2 hours twice a week. The Instructor is Dr. Jim Anderson.
A video of Dr. Anderson using these slides is available on YouTube at: https://www.youtube.com/watch?feature=player_embedded&v=_LxfIQuFALY
Python is a widely used general purpose programming language that was created in the late 1980s by Guido van Rossum. It emphasizes code readability and has a large standard library. It supports multiple programming paradigms like object oriented, imperative, and functional programming. Compared to other languages, Python programs are typically shorter than equivalent programs in languages like Java due to features like dynamic typing.
1. The document provides an overview of the major Java 2 Enterprise Edition (J2EE) technologies including Java servlets, JavaServer Pages (JSP), Enterprise JavaBeans (EJB), Java Message Service (JMS), Java Database Connectivity (JDBC), and Java Naming and Directory Interface (JNDI).
2. It describes the basic anatomy and functionality of servlets, JSP, EJB components including session and entity beans, and JMS.
3. Examples of simple servlet, JSP, EJB, and JMS code are included to illustrate how each technology can be implemented.
The document defines key concepts for computing full disjunctions from a set of relations, including:
- Universal and maximal tuples that integrate connected and join-consistent tuples from the relations.
- The full disjunction, which is the set of all maximal integrated tuples that can be generated from tuples in the relations.
- Matching graphs used to represent matches between a query and database, with strata used to extend cyclic matching graphs.
1) The document discusses querying incomplete data using different semantics like OR-semantics and weak semantics to return maximal answers rather than complete answers.
2) It describes computing full disjunctions of relations as a special case of weak semantics and how full disjunctions generalize to allow non-equality join constraints.
3) The complexity of evaluating maximal weak matchings and maximal OR matchings for cyclic queries is polynomial in the size of the query, database, and result.
The document discusses full disjunctions, which is a variation of the join operator that maximally combines join-consistent tuples from connected relations while preserving all information in the relations. An example is provided to illustrate a full disjunction using three sample relations - Climates, Accommodations, and Sites. The full disjunction returns all possible combinations of tuples from the relations.
The document discusses algorithms for computing full disjunctions over multiple relations. Full disjunctions maximally combine tuples from relations while allowing for incompleteness. The paper presents three algorithms: one for tree-structured schemas, one for general schemas, and an improved main algorithm. The algorithms aim to compute full disjunctions with polynomial delay, producing each result incrementally and in time linear in the output size.
The document discusses semantic search engines and describes two related projects called SEWASIE and WISDOM. SEWASIE used a two-level architecture with data integration at the peer and super-peer levels to allow querying across multiple data sources. It exploited ontologies and mappings between data sources and the global schema. WISDOM improved on this with a distributed peer-to-peer architecture and use of domain ontologies to integrate data at both the peer and network levels.
This is the keynote of the Into the Box conference, highlighting the release of the BoxLang JVM language, its key enhancements, and its vision for the future.
Big Data Analytics Quick Research Guide by Arthur Morgan
This is a Quick Research Guide (QRG).
QRGs include the following:
- A brief, high-level overview of the QRG topic.
- A milestone timeline for the QRG topic.
- Links to various free online resource materials to provide a deeper dive into the QRG topic.
- Conclusion and a recommendation for at least two books available in the SJPL system on the QRG topic.
QRGs planned for the series:
- Artificial Intelligence QRG
- Quantum Computing QRG
- Big Data Analytics QRG
- Spacecraft Guidance, Navigation & Control QRG (coming 2026)
- UK Home Computing & The Birth of ARM QRG (coming 2027)
Any questions or comments?
- Please contact Arthur Morgan at [email protected].
100% human made.
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2... – Alan Dix
Talk at the final event of Data Fusion Dynamics: A Collaborative UK-Saudi Initiative in Cybersecurity and Artificial Intelligence funded by the British Council UK-Saudi Challenge Fund 2024, Cardiff Metropolitan University, 29th April 2025
https://alandix.com/academic/talks/CMet2025-AI-Changes-Everything/
Is AI just another technology, or does it fundamentally change the way we live and think?
Every technology has a direct impact with micro-ethical consequences, some good, some bad. However more profound are the ways in which some technologies reshape the very fabric of society with macro-ethical impacts. The invention of the stirrup revolutionised mounted combat, but as a side effect gave rise to the feudal system, which still shapes politics today. The internal combustion engine offers personal freedom and creates pollution, but has also transformed the nature of urban planning and international trade. When we look at AI the micro-ethical issues, such as bias, are most obvious, but the macro-ethical challenges may be greater.
At a micro-ethical level AI has the potential to deepen social, ethnic and gender bias, issues I have warned about since the early 1990s! It is also being used increasingly on the battlefield. However, it also offers amazing opportunities in health and education, as the recent Nobel prizes for the developers of AlphaFold illustrate. More radically, the need to encode ethics acts as a mirror to surface essential ethical problems and conflicts.
At the macro-ethical level, by the early 2000s digital technology had already begun to undermine sovereignty (e.g. gambling), market economics (through network effects and emergent monopolies), and the very meaning of money. Modern AI is the child of big data, big computation and ultimately big business, intensifying the inherent tendency of digital technology to concentrate power. AI is already unravelling the fundamentals of the social, political and economic world around us, but this is a world that needs radical reimagining to overcome the global environmental and human challenges that confront us. Our challenge is whether to let the threads fall as they may, or to use them to weave a better future.
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf – Software Company
Explore the benefits and features of advanced logistics management software for businesses in Riyadh. This guide delves into the latest technologies, from real-time tracking and route optimization to warehouse management and inventory control, helping businesses streamline their logistics operations and reduce costs. Learn how implementing the right software solution can enhance efficiency, improve customer satisfaction, and provide a competitive edge in the growing logistics sector of Riyadh.
HCL Nomad Web – Best Practices and Management of Multiuser Environments – panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-nomad-web-best-practices-und-verwaltung-von-multiuser-umgebungen/
HCL Nomad Web is celebrated as the next generation of the HCL Notes client and offers numerous advantages, such as eliminating the need for packaging, distribution, and installation. Nomad Web client updates are installed "automatically" in the background, which significantly reduces the administrative effort compared to traditional HCL Notes clients. However, troubleshooting in Nomad Web presents unique challenges compared to the Notes client.
Join Christoph and Marc as they demonstrate how the troubleshooting process in HCL Nomad Web can be simplified to ensure a smooth and efficient user experience.
In this webinar, we will explore effective strategies for diagnosing and resolving common problems in HCL Nomad Web, including
- Accessing the console
- Finding and interpreting log files
- Accessing the data folder in the browser's cache (using OPFS)
- Understanding the differences between single-user and multi-user scenarios
- Using the Client Clocking feature
AI and Data Privacy in 2025: Global Trends – InData Labs
In this infographic, we explore how businesses can implement effective governance frameworks to address AI data privacy. Understanding it is crucial for developing effective strategies that ensure compliance, safeguard customer trust, and leverage AI responsibly. Equip yourself with insights that can drive informed decision-making and position your organization for success in the future of data privacy.
This infographic contains:
-AI and data privacy: Key findings
-Statistics on AI data privacy in today’s world
-Tips on how to overcome data privacy challenges
-Benefits of AI data security investments.
Keep up-to-date on how AI is reshaping privacy standards and what this entails for both individuals and organizations.
The Evolution of Meme Coins: A New Era for Digital Currency ppt.pdf – Abi john
Analyze the growth of meme coins from mere online jokes to potential assets in the digital economy. Explore the community, culture, and utility as they elevate themselves to a new era in cryptocurrency.
Technology Trends in 2025: AI and Big Data Analytics – InData Labs
At InData Labs, we have been keeping an ear to the ground, looking out for AI-enabled digital transformation trends coming our way in 2025. Our report will provide a look into the technology landscape of the future, including:
-Artificial Intelligence Market Overview
-Strategies for AI Adoption in 2025
-Anticipated drivers of AI adoption and transformative technologies
-Benefits of AI and Big data for your business
-Tips on how to prepare your business for innovation
-AI and data privacy: Strategies for securing data privacy in AI models, etc.
Download your free copy nowand implement the key findings to improve your business.
Spark is a powerhouse for large datasets, but when it comes to smaller data workloads, its overhead can sometimes slow things down. What if you could achieve high performance and efficiency without the need for Spark?
At S&P Global Commodity Insights, having a complete view of global energy and commodities markets enables customers to make data-driven decisions with confidence and create long-term, sustainable value. 🌍
Explore delta-rs + CDC and how these open-source innovations power lightweight, high-performance data applications beyond Spark! 🚀
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersToradex
Toradex brings robust Linux support to SMARC (Smart Mobility Architecture), ensuring high performance and long-term reliability for embedded applications. Here’s how:
• Optimized Torizon OS & Yocto Support – Toradex provides Torizon OS, a Debian-based easy-to-use platform, and Yocto BSPs for customized Linux images on SMARC modules.
• Seamless Integration with i.MX 8M Plus and i.MX 95 – Toradex SMARC solutions leverage NXP’s i.MX 8 M Plus and i.MX 95 SoCs, delivering power efficiency and AI-ready performance.
• Secure and Reliable – With Secure Boot, over-the-air (OTA) updates, and LTS kernel support, Toradex ensures industrial-grade security and longevity.
• Containerized Workflows for AI & IoT – Support for Docker, ROS, and real-time Linux enables scalable AI, ML, and IoT applications.
• Strong Ecosystem & Developer Support – Toradex offers comprehensive documentation, developer tools, and dedicated support, accelerating time-to-market.
With Toradex’s Linux support for SMARC, developers get a scalable, secure, and high-performance solution for industrial, medical, and AI-driven applications.
Do you have a specific project or application in mind where you're considering SMARC? We can help with Free Compatibility Check and help you with quick time-to-market
For more information: https://www.toradex.com/computer-on-modules/smarc-arm-family
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025BookNet Canada
Book industry standards are evolving rapidly. In the first part of this session, we’ll share an overview of key developments from 2024 and the early months of 2025. Then, BookNet’s resident standards expert, Tom Richardson, and CEO, Lauren Stewart, have a forward-looking conversation about what’s next.
Link to recording, transcript, and accompanying resource: https://bnctechforum.ca/sessions/standardsgoals-for-2025-standards-certification-roundup/
Presented by BookNet Canada on May 6, 2025 with support from the Department of Canadian Heritage.
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxJustin Reock
Building 10x Organizations with Modern Productivity Metrics
10x developers may be a myth, but 10x organizations are very real, as proven by the influential study performed in the 1980s, ‘The Coding War Games.’
Right now, here in early 2025, we seem to be experiencing YAPP (Yet Another Productivity Philosophy), and that philosophy is converging on developer experience. It seems that with every new method we invent for the delivery of products, whether physical or virtual, we reinvent productivity philosophies to go alongside them.
But which of these approaches actually work? DORA? SPACE? DevEx? What should we invest in and create urgency behind today, so that we don’t find ourselves having the same discussion again in a decade?
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxshyamraj55
We’re bringing the TDX energy to our community with 2 power-packed sessions:
🛠️ Workshop: MuleSoft for Agentforce
Explore the new version of our hands-on workshop featuring the latest Topic Center and API Catalog updates.
📄 Talk: Power Up Document Processing
Dive into smart automation with MuleSoft IDP, NLP, and Einstein AI for intelligent document workflows.
Generative Artificial Intelligence (GenAI) in BusinessDr. Tathagat Varma
My talk for the Indian School of Business (ISB) Emerging Leaders Program Cohort 9. In this talk, I discussed key issues around adoption of GenAI in business - benefits, opportunities and limitations. I also discussed how my research on Theory of Cognitive Chasms helps address some of these issues
3. Open-source applications -- or at least some of them -- are good. Scripting languages are good. Python is the best scripting language. Executive summary: time, money, and innovative energy... can be saved. Productivity, speed, and quality... can be improved.
4. Topics Scripting languages Python Issues surrounding use of Python Open-Source Software What the experts think Where we might find Python useful
5. What is a "scripting" language? Interpreted: requires a run-time interpreter or virtual machine. Untyped or dynamically typed. No data declarations. No compilation step.
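To make "dynamically typed, no declarations, no compilation step" concrete, here is a minimal sketch (not part of the original slides; the file name is invented): the same name can hold values of different types, and the file runs directly under the interpreter.

    # greet.py -- run with:  python greet.py  (no separate compile step)
    x = 42              # x currently holds an integer; nothing was declared
    x = "forty-two"     # the same name can now hold a string
    print(x.upper())    # prints: FORTY-TWO

    def greet(name):
        # parameters and return values carry no declared types
        return "Hello, " + name

    print(greet("world"))   # prints: Hello, world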
6. In the beginning... System programming languages Assembler, C Fortran, Cobol, Algol PL/1, Pascal, Basic, C++, Java Command languages JCL TSO CLists, CMS "execs" Batch files
8. Scripting languages evolve UNIX shell scripting languages Rexx Tcl, TK Perl Python PHP Ruby SAS Many have higher-level object-oriented features that make them powerful application development languages in their own right.
10. "Scripting: Higher Level Programming for the 21st Century" by John K. Ousterhout, IEEE Computer magazine, March 1998 -- http://home.pacbell.net/ouster/scripting.html Assembly language: one machine instruction per line. System programming languages: 3-7 machine instructions per line. Scripting languages: hundreds to thousands of machine instructions per line. Programmers can write roughly the same number of lines of code per year regardless of language. Productivity = the number of machine instructions that a programmer can produce per year.
11. [Chart: "Language Levels and Productivity" -- degree of typing (none to strong) plotted against instructions per statement (1 to 1000); assembler at the bottom, system languages (C, C++, Java) in the middle, scripting languages (VB, Python, Perl, Ruby, Tcl) at the top. From "Scripting: Higher Level Programming for the 21st Century" by John K. Ousterhout. This version prepared by Dana Moore and updated by Stephen Ferg.]
13. John Ousterhout, "Scripting: Higher Level Programming for the 21st Century" -- IEEE Computer, 1998. Scripting languages represent a different set of tradeoffs than system programming languages. They give up execution speed and strong typing but provide significantly higher programmer productivity and software reuse. This tradeoff makes increasing sense as computers become faster and cheaper compared to programmers. For the last fifteen years a fundamental change has been occurring in the way people write computer programs... from system programming languages to scripting languages. This article explains why scripting languages will handle many of the programming tasks of the next century better than system programming languages.
14. Robert C. Martin I think there is a trend in language that will become more and more evident as the decade progresses. I think we are seeing an end to the emphasis on statically typed languages like C++, Java, Eiffel, Pascal, and Ada. I expect to see an ever increasing use of dynamically typed languages, such as Python, Ruby, and even Smalltalk. These languages, and languages of their kind, will be mainstream industrial languages in the coming years.
15. Tim O'Reilly People are so stuck in the personal computer paradigm that they don't recognize that the nature of applications has undergone a profound change in the last decade, with most of the new killer apps running on what has been called the LAMP platform (Linux-Apache-MySQL-PHP | Perl | Python). People understand the importance of Linux, Apache and MySQL... but they still struggle with understanding the "P" in LAMP. The reason why dynamic languages like Perl, Python, and PHP are so important is key to understanding the paradigm shift. Unlike applications from the previous paradigm, web applications are not released in one to three year cycles. They are updated every day, sometimes every hour. Why Scripting Languages Matter
16. Agile programming languages We really should stop calling them "scripting" languages. "Agile" languages would be more accurate. Kevin Altis and Ward Cunningham
17. Python - a great agile programming language! Powerful Easy-to-learn Easy-to-use Open-Source "Python" from "Monty Python's Flying Circus"
18. Python language features Derived from ABC, Modula-3, and C Object-Oriented Dynamically typed Interpreted Cross-platform (Unix, Windows, etc.) Extensible Flexible
19. Python's most obvious feature Uses indentation as a control structure no DO.. END no BEGIN..END no { .. }
20. Indentation as a control-structure

    for i in range(20):
        if i%3 == 0:
            print i
            if i%5 == 0:
                print "Bingo!"
        print "---"

Output: 0 Bingo! --- --- --- 3 --- --- --- 6 --- --- --- 9 --- --- --- 12 --- --- --- 15 Bingo! --- --- --- 18 --- ---
21. Sample Python Code See the handouts distributed with this presentation. For a quick overview of Python's features: http://www.ferg.org/python_slides/index.html
22. Python's advantages Productivity and Ease-Of-Use Maintainability Flexibility (OO & functional programming) Power Plays well with other languages Jython compiles to Java byte code Extensible Easy to extend with, or call from, C
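As a hedged illustration of "easy to extend with, or call from, C" (not from the original slides): the ctypes module, which joined the standard library a few years after this presentation was written, can call an existing C library directly. The library lookup below assumes a Unix-like system.

    import ctypes
    import ctypes.util

    # Locate and load the C math library (lookup assumes a Unix-like system).
    libm = ctypes.CDLL(ctypes.util.find_library("m"))

    # Describe the C signature of sqrt: double sqrt(double)
    libm.sqrt.restype = ctypes.c_double
    libm.sqrt.argtypes = [ctypes.c_double]

    print(libm.sqrt(2.0))   # prints: 1.4142135623730951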
23. Python – some history Developed by Guido van Rossum in 1991. A fan of Monty Python's Flying Circus
24. Guido van Rossum Python BDFL - Benevolent Dictator for Life 1999 - Dr. Dobb's Journal Excellence in Programming Awards 2002 – Free Software Foundation (FSF) Award for the Advancement of Free Software
25. "Doctor Fun has the dubious distinction of being the first web cartoon. Doctor Fun was not, however, the first cartoon on the Internet." - http://www.ibiblio.org/Dave/index.html
27. Who is using Python? What are they doing with it? Industrial Light & Magic , maker of the Star Wars films, uses Python extensively in the computer graphics production process. Disney Feature Length Animation uses Python for its animation production applications.
28. Google , a leading internet search engine, is powered by Python. Yahoo uses Python for its groups site. The Inktomi (formerly Infoseek, now part of Yahoo) search engine uses Python. IBM and Philips have used Python to create the business practice logic for factory tool control applications.
29. NASA uses Python in several large projects, including a CAD/CAM system and a graphical workflow modeler used in planning space shuttle missions. The National Institutes of Health (USA) and Case Western Reserve University are building cutting-edge genetic analysis software with Python.
30. The National Weather Service (USA) uses Python to prepare weather forecasts. Python is also used for this purpose at the Swedish Meteorological and Hydrological Institute and at TV4 Sweden . Chandler , the new open-source cross-platform Personal Information Manager being developed by Mitch Kapor, is being written in Python and wxWindows.
31. Lawrence Livermore National Laboratories is basing a new numerical engineering environment on Python. The Theoretical Physics Division at Los Alamos National Laboratory uses Python to control large-scale physics codes on massively parallel supercomputers, high-end servers, and clusters.
32. US Navy uses Python & Zope for a web based workflow system US Dept. of Agriculture - Python & Zope for massive collaboration Should we be using Python? ....
33. Issues to Consider when Evaluating a Programming Language Let's look at some...
34. Are capabilities an issue? "Batteries included" philosophy Standard distribution includes extensive module library Many other modules available Frank Stajano
35. The Python Standard Library: GUI, strings, regular expressions, database connectivity, HTTP, CGI, HTML, XML, numeric processing, debugger, object persistence.
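A small, hedged example of "batteries included" (module names are the modern Python 3 ones; the 2003-era deck would have used urllib rather than urllib.request, and the example assumes network access): fetching a web page and scanning it with a regular expression needs nothing outside the standard library.

    import re
    from urllib.request import urlopen   # plain urllib in Python 2

    # Download the Python home page and count occurrences of the word "Python".
    html = urlopen("https://www.python.org/").read().decode("utf-8", "replace")
    print(len(re.findall(r"\bPython\b", html)))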
36. Is execution speed an issue? Modern processors generally make language speed a non-issue. Many applications are limited by the speed of the database or network connection, not the programming language. Ease of use makes implementing optimization algorithms easier -- it is possible to beat even C programs. It is easy to write an interface to C extension modules for optimization. Python is probably 10 times slower than a system language, but...
37. "In terms of run time and memory consumption, scripting languages often turn out better than Java and not much worse than C or C++." An empirical comparison of C, C++, Java, Perl, Python, Rexx, and Tcl for a search/string-processing program. University of Karlsruhe, Germany, Technical Report 2000-5, March 10, 2000. http://wwwipd.ira.uka.de/~prechelt/Biblio/jccpprtTR.pdf
38. Is dynamic typing an issue? "It might seem that the typeless nature of scripting languages could allow errors to go undetected, but in practice scripting languages are just as safe as system programming languages." "Scripting: Higher Level Programming for the 21st Century" by John K. Ousterhout, IEEE Computer magazine, March 1998 -- http://home.pacbell.net/ouster/scripting.html
39. "I'd been a statically typed bigot for quite a few years. Four years ago I got involved with Extreme Programming. ... I liked the emphasis it put on testing. About two years ago I noticed I was depending less and less on the type system for safety. My unit tests were preventing me from making type errors. So I tried writing some applications in Python, and then Ruby. I found that type issues simply never arose. My unit tests kept my code on the straight and narrow. I simply didn't need static type checking." Robert C. Martin
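A minimal sketch of the point Martin is making (function and test names are invented for illustration), using the standard unittest module: a wrong argument type surfaces at test time rather than at compile time.

    import unittest

    def total_price(quantity, unit_price):
        # No declared types: the tests below are what keep callers honest.
        return quantity * unit_price

    class TotalPriceTest(unittest.TestCase):
        def test_total(self):
            self.assertEqual(total_price(3, 2.50), 7.50)

        def test_string_quantity_is_rejected(self):
            # "3" * 2.50 raises TypeError, so the type mistake is caught
            # here, at test time, instead of by a compiler.
            self.assertRaises(TypeError, total_price, "3", 2.50)

    if __name__ == "__main__":
        unittest.main()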
40. Is productivity an issue? "5-10 times productivity (really!)" Bruce Eckel " I find that I'm able to program about three times faster [in Python] than I could in Java, and I was able to program in Java about three times faster than I could in C ." Andy Hertzfeld "The results indicate that, for the given programming problem, 'scripting languages' (Perl, Python, Rexx, Tcl) are more productive than conventional languages." University of Karlsruhe, Germany, Technical Report 2000-5, March 2000
41. When a 20,000 line project went to approximately 3,000 lines overnight, and came out being more flexible and robust ... I realized I was on to something really good. -- Matthew "Glyph" Lefkowitz ...the lines of Python code were 10% of the equivalent C++ code. -- Greg Stein , eShop
42. "Programming is fun again!" Over and over on comp.lang.python there are messages: "Now that I've discovered Python, I enjoy programming again!" "Now I am back programming in Java because the projects I'm working on call for it. But I wish I was programming in Ruby or Python ..." Robert C. Martin
43. Is maintainability an issue? "I realized that the flexibility of dynamically typed languages makes writing code significantly easier. Modules are easier to write, and easier to change." Robert C. Martin -- http://www.artima.com/weblogs/viewpost.jsp?thread=4639 "Python excels at rapid creation of maintainable code" Bruce Eckel
44. Is weirdness an issue? "Python's use of whitespace stopped feeling unnatural after about twenty minutes. I just indented code, pretty much as I would have done in a C program anyway, and it worked." Eric S. Raymond
45. How weird is it, really? "Most people who use Python consider the indentation syntax to be an important, if not downright critical, feature of the language. It forces you to write readable code, which in turn fosters code maintainability. It's a big win, once you get past the initial shock. In any structured programming language, the indentation of blocks really does mean something. Most Python users think that enforcing consistency in indentation is not only good software engineering, it's simple common sense. The end result is code that is so well laid out that it resembles something akin to poetry." Mark Lutz, author of Programming Python
46. Is support an issue? Python is an "open-source" language. It has no vendor. Does that mean we'll have support problems? What about... Vendor longevity? Consulting & training support? Books and reference materials? Tools? IDEs, debuggers, screen-painters?
47. What is "Open-Source"? A distribution license for source code: the source code is available without charge, and the code may be changed, customized, and enhanced. GPL – the GNU General Public License. Python license – unlike the GPL, you may distribute a modified version without making your changes open source. Also a development style and a culture...
49. The Cathedral & the Bazaar Linux is subversive. Who would have thought ... that a world-class operating system could coalesce as if by magic out of part-time hacking by several thousand developers scattered all over the planet, connected only by the tenuous strands of the Internet? Certainly not I... I believed there was a certain critical complexity above which a more centralized, a priori approach was required. ... the most important software needed to be built like cathedrals, carefully crafted by individual wizards or small bands of mages working in splendid isolation, with no beta to be released before its time.
50. The Cathedral & the Bazaar The Linux style of development came as a surprise. No quiet, reverent cathedral-building here—rather, the Linux community seemed to resemble a great babbling bazaar of differing agendas and approaches out of which a coherent and stable system could seemingly emerge only by a succession of miracles. The Linux world not only didn't fly apart in confusion -- it seemed to go from strength to strength at a speed barely imaginable to cathedral-builders.
51. Some open-source products Linux Apache MySql PHP | Perl | Python Apache has overwhelmingly dominated the Web server market since 1996. PHP is the most popular Apache module, running on almost 10 million domains (over a million IP addresses). "MySQL threatens to do for databases what Linux has done for operating systems." – Tim O'Reilly "LAMP"
52. Is Open-Source software used in the Federal Government? See earlier list of Python users NIH, NASA, Navy, Agriculture, Weather Service In 2002, a Mitre study found 115 FOSS products in use in DoD http://egovos.org/pdf/dodfoss.pdf Why would a Federal agency use open-source software? ...
53. Government Computer News November 20, 2000 The NASA Acquisition Internet Service (NAIS) development team adopted open-source software several years ago and we plan to expand its use in the agency-wide procurement system. We were using a proprietary Web development application that promised interoperability with another vendor’s database software. It failed to interoperate, however... Then we discovered Perl and have been using it for the last five years to develop and support all NAIS applications. Recently, price restructuring for a commercial DBMS threatened to consume most of the NAIS budget. We decided to convert NAIS to MySQL. Our tests showed MySQL could perform NAIS functions faster. Cost of the optional technical support was about 1 percent of that for the commercial product. Technical support for MySQL has been excellent when we needed it, plus there are hundreds of Web sites that offer free help and support for such open-source products. We plan to evaluate the Apache HTTP Server to correct limitations of the commercial Web server we currently use.
54. eGov & Open-Source Center of Open Source & Government ( http://egovos.org/index.html ) EGOVOS - high-level international conference on OSS ("Libre Software"), interoperability and open standards in government October 2002 & March 2003 - Washington, DC EGOVOS3 - 24-26 November, 2003 at UNESCO headquarters in Paris
55. The Open Source Reference Book 2003 - What Local/National Governments, the Defense Establishment, and The Global 1000 Need To Know About Open Source Software ( November 2003) ... will provide a Generally Regarded As Safe (GRAS) list of Open Source software to identify mature and useable Open Source projects ... will list Open Source software that is NIAP* or Common Criteria evaluated *NIAP: National Information Assurance Partnership – NIST security certification
56. So... Is Open-Source Safe ? Vendors and products vary widely in both the commercial and open-source arena. The fact that a piece of software is commercial is no guarantee of its quality, or of its vendor's long-term survival. The best open-source software is as good as the best commercial software.
57. Each product and vendor should be evaluated on its own merits, regardless of whether it is commercial or open-source. Python is in the same league as the best software anywhere, commercial or open-source. The Bottom Line
59. The Python Software Foundation A non-profit organization for advancing open-source technology related to Python Holds Python's intellectual property rights. Produces the core Python distribution, available to the public free of charge. Establishes PSF licenses, ensuring the rights of the public to freely obtain, use, redistribute, and modify intellectual property held by the PSF.
60. Is mindshare an issue? International Python Conference IPC - in USA since 1992 EuroPython conference in Europe since 2002 Python for Scientific Computing Workshop SciPy - in USA since 2002 The Python community is very active and growing rapidly
61. Newsgroup Activity -- comp.lang.* message counts, December 2002 (statistics compiled by Aaron K. Johnson):
    java       26953
    c++        19913
    c          13874
    perl       10486
    python      9647
    basic       7909
    ruby        6466
    lisp        6132
    tcl         5256
    pascal      4229
    smalltalk   2398
    fortran     2355
    cobol       1845
62. TIOBE Popularity of Programming Languages Index, July 2003, based on the number of hits returned by a Google search (index available at http://www.tiobe.com/tpci.htm):
     1  Java                  44.3
     2  C                     36.8
     3  C++                   33.2
     4  Perl                  18.3
     5  (Visual) Basic        15.5
     6  PHP                    7.6
     7  SQL                    6.0
     8  C#                     3.5
     9  JavaScript             3.3
    10  Delphi/Pascal/Kylix    3.1
    11  Python                 2.6
    12  COBOL                  2.3
    13  SAS                    2.2
    14  Fortran                1.9
63. Is online support an issue? comp.lang.python -- Outstanding!! http://groups.google.com/groups ?&group=comp.lang.python
64. Consulting and Training Resources? Not much! Python is probably too easy-to-learn and easy-to-use to support much of a training/ consulting industry. You can learn it out of a book! A couple of useful consulting resources... Zope Corp. Fourthought , Inc. - XML tools for Python and XML and web-based applications.
65. Is Ease-of-Learning an issue? Python is famously easy to use and easy to learn. I talked my colleagues into using Python for our Computer Science 1 course this fall. ... In the past I would be swamped during office hours with students wanting help deciphering C++ compiler errors. This semester almost nobody has stopped by for syntax issues. -- Dave Reed on Python In Education mailing list
67. Online Materials? Python distribution includes: Tutorial, Language Reference Extensive Standard Library documentation "How to Think Like a Computer Scientist with Python" http://greenteapress.com/thinkpython/ "Python Programming – an Introduction to Computer Science" http://mcsp.wartburg.edu/zelle/python/ "Dive Into Python" http://diveintopython.org/index.html Too many others to list...
68. Tools? - IDEs IDLE comes with Python WingIDE – excellent IDE with visual debugger $35 and $180 -- http://wingide.com/
69. Visual Python Python plug-in for Visual Studio .NET. Python-specific features within the familiar Visual Studio environment. Visual Python integrates seamlessly with Visual Studio .NET, allowing programmers to leverage features of Microsoft's popular development tool suite. http://www.activestate.com/Products/Visual_Python
72. Bruce Eckel His book Thinking in C++ was given the Software Development Jolt Award for best book published in 1995. Thinking in Java received Java World Reader's Choice Award and Java World Editor's Choice Award for best book, the Java Developer's Journal Editor's Choice Award for books, the Software Development Productivity Award in 1999, the third edition received the Software Development Magazine Jolt award for best technical book, 2002. One of "the industry's leading lights" ( Windows Tech Journal , September 1996).
74. The language you speak affects what you can think. "Python fits my brain." Python excels at rapid creation of maintainable code Programmer productivity is the most important thing. 5-10 times productivity (really!)
75. Simplicity really does make a difference. I can remember many Python idioms because they’re simpler. One more reason I program faster in Python. I still have to look up how to open a file every time I do it in Java.
76. Python & “The Tipping Point” It is possible to write programs to automate every task. But you don’t. Python makes it easy enough
77. Eric S. Raymond The Cathedral and the Bazaar www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ The New Hacker's Dictionary http://www.jargon.8hz.com/jargon_toc.html Homesteading the Noosphere http://www.firstmonday.dk/issues/issue3_10/raymond/ Well-known Unix guru, Linux advocate, and author
78. This could be an opportunity to get some hands-on experience with Python... I charged ahead and roughed out some code ... http://pythonology.org/success&story=esr Why Python?
79. I noticed I was generating working code nearly as fast as I could type. When you're writing working code nearly as fast as you can type, it generally means you've achieved mastery of the language. But that didn't make sense, because it was still day one ... This was my first clue that, in Python, I was actually dealing with an exceptionally good design.
80. Not that it took me very long to learn the feature set. This reflects another useful property of Python: it is compact -- you can hold its entire feature set (and at least a concept index of its libraries) in your head.
81. The long-term usefulness of a language comes from how well and how unobtrusively it supports the day-to-day work of programming, which consists not of writing new programs, but mostly reading and modifying existing ones. So the real punchline of the story is this: weeks and months after writing fetchmailconf [my Python program], I could still read the code and grok what it was doing without serious mental effort.
82. Martin C. Brown Author and Perl expert Perl: The Complete Reference Perl Annotated Archives ... and ... ... and ...
83. Nicholas Petreley ComputerWorld columnist One of my favorite programming languages is Python. It seems I don't go a week these days without someone asking me what I know about Python, so it seems to be gaining quite a following in mainstream IT. November, 2002
84. The Bottom Line... "Use the Best Tool for the Job: Put Both a Scripting and Systems Language in Your Toolbox"- Bill Venners http://www.artima.com/commentary/langtool.html Python would be a useful tool in the toolboxes of our developers, DBAs, and LAN administrators, for situations where....
85. A command-language is too under-powered, and a systems programming language would be overkill. Speed and minimizing effort are important: one-time, throw-away programs; internal utilities; prototyping; test scaffolding. (A sketch of one such small utility follows.)
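Here is the promised sketch of an "internal utility" (the directory argument, size threshold, and output format are invented for illustration): a complete, throw-away report script using only the standard library.

    import os
    import sys

    # List files larger than 1 MB under a directory given on the command line.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue   # skip files that vanish or cannot be read
            if size > 1024 * 1024:
                print(path + "  (" + str(size // 1024) + " KB)")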
86. Cross-platform portability is important: system administrators need to learn only one scripting language; prototype/develop on one platform, deploy on another (e.g. Windows NT and Unix). Readability & maintainability are important. XML processing: "The Python people also piped up to say 'everything's just fine here' but then they always do. I really must learn that language." -- Tim Bray, co-author of the original XML 1.0 spec, in "XML Is Too Hard For Programmers". (A small XML example follows.)
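A hedged sketch of the kind of XML processing the slide mentions (the document and tag names are invented), using xml.dom.minidom from the standard library:

    from xml.dom import minidom

    # A small, made-up purchase-order document.
    xml_text = """
    <order id="1001">
      <item sku="A-17" qty="2"/>
      <item sku="B-42" qty="1"/>
    </order>
    """

    doc = minidom.parseString(xml_text)
    for item in doc.getElementsByTagName("item"):
        print(item.getAttribute("sku") + " x " + item.getAttribute("qty"))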
87. Ease-of-learning is important An application is written in four different languages (Java, C, Perl, and Unix shell-script) because it was built by four different developers who were expert in four different languages. Everybody knows this is a problem, but nobody has time to learn another language. One solution -- a single common language that is both powerful enough to handle a wide variety of tasks, and easy enough to learn quickly and easily.
88. More Online Information http://www.python.org is the Python home page Chapter 1 of Internet Programming with Python is available online. It discusses reasons for using Python. http://www.fsbassociates.com/books/pythonchpt1.htm Python Compared to Other Languages http://www.python.org/doc/Comparisons.html
#2: Topic: Introduction to Python, for IT managers Revision Date: 2003-07-17 Filename: python_intro_for_managers.ppt URL: http://www.ferg.org/python_presentations/index.html Author: Stephen Ferg ([email protected]) This presentation introduces IT managers to Python. It pays special attention to issues that concern IT management in Federal government agencies. Many of the slides have presenter's notes. If you plan to give this presentation, I suggest you print out the slideshow in a format that shows the notes, and read them. They may give you some ideas about points to be made, further resources, etc. You are free to use this file and modify it as you wish. (In particular, of course, you should replace my name and organization with your own on the opening slide!) Acknowledgements are not required – keep the audience focused on Python, not on who developed the slides. The wording of some of the quotations has been slightly modified from the original document to accommodate the need for compression in a slide presentation. But in all cases the wording of the quotations is substantially the same as it was in the original, and the point of a quote has never been altered. I have always tried to include a reference to the source of a quotation, so anyone may consult the original document if they wish to do so. -- Steve Ferg