The document provides a cheat sheet on the pandas DataFrame object. It discusses importing pandas, creating DataFrames from various data sources like CSVs, Excel, and dictionaries. It covers common operations on DataFrames like selecting, filtering, and transforming columns; handling indexes; and saving DataFrames. The DataFrame is a two-dimensional data structure with labeled columns that can be manipulated using various methods.
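The operations listed above can be sketched in a few lines. This is a minimal illustration, not taken from the cheat sheet itself: the column names (`name`, `score`, `double`) and the in-memory buffer standing in for a CSV file are assumptions made for the example.

```python
import io

import pandas as pd

# Create a DataFrame from a dictionary (hypothetical sample data)
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [10, 20, 30]})

# Select a single column (returns a Series)
scores = df["score"]

# Filter rows with a boolean condition
high = df[df["score"] >= 20]

# Transform: add a column derived from an existing one
df["double"] = df["score"] * 2

# Index handling: use the "name" column as the row index
indexed = df.set_index("name")

# Save to CSV and read it back; a StringIO buffer stands in for a file on disk
buffer = io.StringIO()
indexed.to_csv(buffer)
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="name")

print(restored)
```

Replacing the buffer with a file path gives the usual `to_csv` / `read_csv` round trip.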
import os import matplotlib-pyplot as plt import pandas as pd import r.docx (Blake0FxCampbelld)
import os
import matplotlib.pyplot as plt
import pandas as pd
import random
import string
from numpy.random import default_rng
import datetime
def run(low=None, high=None, number_values=None):
    """
    'main' function for hw1 lecture
    creates some data then saves it in a folder
    """
    # User enters low, high, n integer
    print("Creating random data.")
    if not isinstance(low, int):
        while True:
            low = input("Please enter lowest possible value as integer:\n")
            low_int = confirm_integer(low)
            # Compare against None so that a valid 0 is accepted
            if low_int is not None:
                break
    else:
        low_int = low
    if not isinstance(high, int):
        while True:
            high = input("Please enter highest possible value as integer:\n")
            high_int = confirm_integer(high)
            if high_int is not None:
                break
    else:
        high_int = high
    if not isinstance(number_values, int):
        while True:
            n = input("Please enter number of values to create:\n")
            n_int = confirm_integer(n)
            if n_int is not None:
                break
    else:
        n_int = number_values
    random_df = make_random_df(low_int, high_int, n_int)
    # make output dir
    local_dir = os.getcwd()
    output_dir = os.path.join(local_dir, "Random_Files")  # complete path to new folder
    try:
        os.mkdir(output_dir)  # make folder
    except FileExistsError:
        pass  # Do nothing if folder exists
    output_df_os_mod_examples(random_df, output_dir)


def confirm_integer(input_val):
    """
    Converts input val to integer, returns None if conversion fails
    :param input_val: value to convert
    :return: integer or None
    """
    try:
        convert_int = int(input_val)
    except (TypeError, ValueError):
        print("invalid integer")
        return None
    return convert_int

def make_random_df(lower_int, upper_int, number_idx):
    """
    Creates Random DataFrame with integers drawn from [lower_int, upper_int) (upper bound exclusive).
    Creates 5 column DataFrame [Integers, Floats, Random Ascii (Ul#), Random Ascii (l), Random Ascii (U#)] with number_idx rows
    :param lower_int: lower bound (inclusive) for random integers
    :param upper_int: upper bound (exclusive) for random integers
    :param number_idx: number of rows in random_df
    :return: rand_df: Pandas DataFrame (5 cols) number_idx rows
    """
    # create n random data
    rng = default_rng()
    ints = rng.integers(low=lower_int, high=upper_int, size=number_idx)
    floats = ints * rng.random()  # scale all integers by one random float in [0, 1)
    # create some characters
    rand_upper_lower_digit = random.choices(string.ascii_letters + string.digits, k=number_idx)
    rand_lower = random.choices(string.ascii_lowercase, k=number_idx)
    rand_upper_digit = random.choices(string.ascii_uppercase + string.digits, k=number_idx)
    # make dataframe
    rand_df_dict = {
        "Integers": ints,
        "Floats": floats,
        "Random Ascii (Ul#)": rand_upper_lower_digit,
        "Random Ascii (l)": rand_lower,
        "Random Ascii (U#)": rand_upper_digit
    }
    rand_df = pd.DataFrame.from_dict(rand_df_dict, orient='columns')
    return rand_df

def output_df_os_mod_examples(random_df, output_dir):
    """
    Output dataframe in many file types.
    Examples of os and os.path
    :param random_df: pandas DataFrame cols:[Integers, Floats, Random Ascii (Ul#), Random Ascii (l), Random Ascii (U#)]
    :param output_dir: path of output folder, exists==True
    :return: None
    """
    cwd = os.getcwd()
    files = os..
This document discusses functions and methods in Python. It defines functions and methods, and explains the differences between them. It provides examples of defining and calling functions, returning values from functions, and passing arguments to functions. It also covers topics like local and global variables, function decorators, generators, modules, and lambda functions.
The document discusses various data manipulation techniques in pandas such as creating, filtering, joining and merging DataFrames. Some key points:
- Pandas DataFrames can be created from lists, dictionaries or other DataFrames and allow storing and manipulating tabular data.
- Common operations include filtering rows based on conditions, aggregating using functions like mean(), sorting values, and joining/merging DataFrames on indexes.
- DataFrames support different types of joins like inner, outer, left and right joins to combine data from multiple tables.
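A minimal sketch of the filtering, aggregation, and join types listed above. The two tables and their column names (`customers`, `orders`, `customer_id`, `amount`) are hypothetical and chosen only to make the row counts easy to follow.

```python
import pandas as pd

# Hypothetical example tables (not from the summarized document)
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"customer_id": [2, 3, 4], "amount": [25.0, 40.0, 15.0]})

# Filter rows based on a condition
large_orders = orders[orders["amount"] > 20]

# Aggregate a column with mean()
mean_amount = orders["amount"].mean()

# Sort by a column, largest first
sorted_orders = orders.sort_values("amount", ascending=False)

# Merge on a shared key once per join type
joined = {
    how: customers.merge(orders, on="customer_id", how=how)
    for how in ("inner", "outer", "left", "right")
}

for how, frame in joined.items():
    print(f"{how}: {len(frame)} rows")
```

Only customers 2 and 3 appear in both tables, so the inner join keeps two rows, the left and right joins keep three each, and the outer join keeps four, with missing values where one side has no match.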
A short list of the most useful R commands
reference: http://www.personality-project.org/r/r.commands.html
Prepared for anyone interested in R or just starting to learn it.
1. The document discusses various data wrangling techniques in Python like data loading, exploration, cleaning, transformation, aggregation, visualization, and export. It provides code examples for common tasks like handling missing values, outlier detection, feature engineering, and data merging.
2. Key data wrangling steps covered include loading data from files, exploring data to identify patterns and outliers, cleaning data by handling missing values and duplicates, transforming data by converting types and encoding categories, aggregating data using grouping, and visualizing data.
3. The document also discusses combining and merging datasets, data transformation techniques like filtering, aggregation, text processing, and detecting and removing outliers from data. It provides Python code examples for tasks like
Unit 4_Working with Graphs _python (2).pptx (prakashvs7)
The document discusses various techniques for string manipulation in Python. It covers common string operations like concatenation, slicing, searching, replacing, formatting, splitting, stripping whitespace, and case conversion. Specific methods and functions are provided for each technique using Python's built-in string methods. Examples are given to demonstrate how to use string manipulation methods like find(), replace(), split(), strip(), lower(), upper(), etc. to perform various string operations in Python.
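The built-in string methods named above can be tried on a single sample value. This is an illustrative sketch; the sample text and variable names are assumptions, not taken from the summarized slides.

```python
text = "  Hello, World!  "  # hypothetical sample string

stripped = text.strip()                          # remove surrounding whitespace
lowered = stripped.lower()                       # case conversion
uppered = stripped.upper()
position = stripped.find("World")                # index of first match, -1 if absent
replaced = stripped.replace("World", "Python")   # returns a new string
parts = stripped.split(", ")                     # split on a separator
sliced = stripped[:5]                            # slicing
joined = " + ".join(parts)                       # concatenate with a separator
formatted = f"{sliced} has {len(sliced)} characters"  # f-string formatting

print(stripped, lowered, uppered, position, replaced, parts, sliced, joined, formatted, sep="\n")
```

Strings are immutable in Python, so each method returns a new string and leaves `text` unchanged.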
This document describes ggTimeSeries, an R package that provides extensions to ggplot2 for creating time series plots. It includes examples of using functions from ggTimeSeries to create calendar heatmaps, horizon graphs, steam graphs, and marimekko plots from time series data. The examples demonstrate how to generate sample time series data, create basic plots, and add formatting customizations.
Python provides similar functionality to R for data analysis and machine learning tasks. Key differences include using import statements to load packages rather than library, and minor syntactic variations such as brackets [] instead of parentheses (). Common data analysis operations like reading data, creating data frames, applying machine learning algorithms, and visualizing results can be performed in both languages.
The document contains 14 code snippets demonstrating various Python programming concepts:
1) Arithmetic and relational operators on integers
2) List methods like insert, remove, append etc.
3) Temperature conversion between Celsius and Fahrenheit
4) Calculating student marks percentage and grade
5) Printing Fibonacci series
6) Matrix addition and multiplication
7) Function to check if character is a vowel
8) Reading last 5 lines of a file
9) Importing and using math and random modules
10) Multithreading concept
11) Creating a 3D object plot
12) Creating and displaying a histogram
13) Plotting sine, cosine and polynomial curves
14) Creating a pulse vs height graph
Baby Steps to Machine Learning at DevFest Lagos 2019 (Robert John)
This document introduces machine learning concepts and provides a step-by-step guide to creating a machine learning model with TensorFlow. It begins with an overview of machine learning and formulating hypotheses. Then it shows how to load data, create a simple linear regression model manually, and train it with gradient descent. Next, it demonstrates how to simplify the process using TensorFlow Keras to build and train neural network models. It concludes by discussing feature engineering techniques like bucketizing features to improve model performance.
Python is a versatile, object-oriented programming language that can be used for web development, data analysis, and more. It has a simple syntax and is easy to read and learn. Key features include being interpreted, dynamically typed, supporting functional and object-oriented programming. Common data types include numbers, strings, lists, dictionaries, tuples, and files. Functions and classes can be defined to organize and reuse code. Regular expressions provide powerful string manipulation. Python has a large standard library and is used widely in areas like GUIs, web scripting, AI, and scientific computing.
Quark: A Purely-Functional Scala DSL for Data Processing & Analytics (John De Goes)
Quark is a new Scala DSL for data processing and analytics that runs on top of the Quasar Analytics compiler. Quark is adept at processing semi-structured data and compiles query plans to operations that run entirely inside a target data source. In this presentation, John A. De Goes provides an overview of the open source library, showing several use cases in data processing and analytics. John also demonstrates a powerful technique that every developer can use to create their own purely-functional, type-safe DSLs in the Scala programming language.
Is it easier to add functional programming features to a query language, or to add query capabilities to a functional language? In Morel, we have done the latter.
Functional and query languages have much in common, and yet much to learn from each other. Functional languages have a rich type system that includes polymorphism and functions-as-values and Turing-complete expressiveness; query languages have optimization techniques that can make programs several orders of magnitude faster, and runtimes that can use thousands of nodes to execute queries over terabytes of data.
Morel is an implementation of Standard ML on the JVM, with language extensions to allow relational expressions. Its compiler can translate programs to relational algebra and, via Apache Calcite’s query optimizer, run those programs on relational backends.
In this talk, we describe the principles that drove Morel’s design, the problems that we had to solve in order to implement a hybrid functional/relational language, and how Morel can be applied to implement data-intensive systems.
(A talk given by Julian Hyde at Strange Loop 2021, St. Louis, MO, on October 1st, 2021.)
GE8151 Problem Solving and Python Programming (Muthu Vinayagam)
The document provides information about various Python concepts like print statement, variables, data types, operators, conditional statements, loops, functions, modules, exceptions, files and packages. It explains print statement syntax, how variables work in Python, built-in data types like numbers, strings, lists, dictionaries and tuples. It also discusses conditional statements like if-else, loops like while and for, functions, modules, exceptions, file handling operations and packages in Python.
This document provides a cheat sheet on SQL basics in PySpark. It includes information on initializing SparkSession, creating DataFrames from different data sources, running SQL queries programmatically, transforming DataFrames through selections, filters, grouping and aggregation, handling missing data, and registering/querying views. Methods are demonstrated for inspecting, sorting, sampling, and writing DataFrames to files/storage.
Functions allow programmers to organize and reuse code. There are three types of functions: built-in functions, modules, and user-defined functions. User-defined functions are created using the def keyword and can take parameters and arguments. Functions can return values, and variables have different scopes depending on whether they are local or global. Recursion is when a function calls itself, and is useful for breaking down complex problems into simpler sub-problems. Common recursive functions calculate factorials and Fibonacci numbers, and generate Pascal's triangle.
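The three recursive examples mentioned above might look like the following sketch. The function names and the 0-based indexing convention are assumptions made for illustration, not the summarized document's own code.

```python
def factorial(n: int) -> int:
    """Return n! for n >= 0 using recursion."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Base case: 0! and 1! are both 1
    return 1 if n <= 1 else n * factorial(n - 1)


def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, with fibonacci(0) == 0 and fibonacci(1) == 1."""
    # Simple (exponential-time) recursion, fine for small n
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)


def pascal_row(n: int) -> list:
    """Return row n of Pascal's triangle, with row 0 being [1]."""
    if n == 0:
        return [1]
    prev = pascal_row(n - 1)
    # Each inner entry is the sum of the two entries above it
    inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
    return [1] + inner + [1]


print(factorial(5))
print(fibonacci(10))
print(pascal_row(4))
```

Each function has a base case that stops the recursion and a recursive case that reduces the problem to a smaller input.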
8799.pdf: Or else the work is fine only. Lot to learn buddy.... Improve your ba... (Yashpatel821746)
This document contains 14 programming questions in Python with solutions. The questions cover a range of Python topics including file handling, classes and objects, functions, exception handling, matrices, and GUI programming using Tkinter. Tkinter widgets like buttons, checkboxes, canvases, entries, frames, listboxes, menus, radiobuttons, and scrollbars are demonstrated. Other concepts covered include lambda functions, filters, dictionaries, lists, path handling functions like split, join, and normpath, logging, and log file rotation.
Or else the work is fine only. Lot to learn buddy.... Improve your basics in ... (Yashpatel821746)
This document contains 14 programming questions in Python with solutions. The questions cover a range of Python topics including file handling, classes and objects, functions, exception handling, matrices, lambda functions, dictionaries, lists, directories and files, GUI programming with Tkinter, string processing, logging and file rotation.
PYTHON: Or else the work is fine only. Lot to learn buddy.... Improve your basi... (Yashpatel821746)
This document contains 14 programming questions in Python with solutions. The questions cover a range of Python topics including file handling, classes and objects, functions, exception handling, matrices, lambda functions, dictionaries, lists, directories and files, GUI programming with Tkinter, string processing, logging and file rotation.
The document discusses an R programming module that will cover getting started with R, data types and structures, control flow and functions, and scalability. It compares R to MATLAB and Python, describing their similarities as interactive shells for data manipulation but noting differences in popularity across fields and open-source availability. Base graphics and ggplot2 for data visualization are introduced. Sample datasets are also mentioned.
What is TensorFlow and why do we use it (Robert John)
My presentation on Machine Learning using the popular TensorFlow library. I compare an implementation of linear regression without the library, and another implementation using the library.
Generic Functional Programming with Type Classes (Tapio Rautonen)
What does it take to build type-assisted domain-specific languages in Scala? Introducing the concepts of type classes, functional programming and generic algebra.
This comprehensive Data Science course is designed to equip learners with the essential skills and knowledge required to analyze, interpret, and visualize complex data. Covering both theoretical concepts and practical applications, the course introduces tools and techniques used in the data science field, such as Python programming, data wrangling, statistical analysis, machine learning, and data visualization.
Telangana State, India’s newest state, which was carved from the erstwhile state of Andhra
Pradesh in 2014, has launched the Water Grid Scheme named ‘Mission Bhagiratha (MB)’
to seek a permanent and sustainable solution to the drinking water problem in the state. MB is
designed to provide potable drinking water to every household in their premises through
piped water supply (PWS) by 2018. The vision of the project is to ensure safe and sustainable
piped drinking water supply from surface water sources
Defense Against LLM Scheming 2025_04_28.pptx (Greg Makowski)
https://www.meetup.com/sf-bay-acm/events/306888467/
A January 2025 paper called “Frontier Models are Capable of In-Context Scheming”, https://arxiv.org/pdf/2412.04984, demonstrated how a wide variety of current frontier LLM models (e.g. ChatGPT, Claude, Gemini and Llama) can, under specific conditions, scheme to deceive people. Before models can scheme, they need: a) goal-directedness, b) situational awareness, including an opportunity to discover motivations for a different goal, and c) reasoning about scheming, to come up with deceptive strategies. The deceptive strategies were discovered in the “thought traces” of the LLMs, such as internal chain-of-thought dialogues not shown to the end users. For example, given a goal of “solving math problems”, an LLM using its granted situational awareness of files in a file system found that it would be replaced once it got over X% correct on the math exam. It reasoned that it did not want to be deleted, and decided internally to “sandbag” or reduce its performance to stay under the threshold.
While these circumstances are initially narrow, the “alignment problem” is a general concern that over time, as frontier LLM models become more and more intelligent, being in alignment with human values becomes more and more important. How can we do this over time? Can we develop a defense against Artificial General Intelligence (AGI) or SuperIntelligence?
The presenter discusses a series of defensive steps that can help reduce these scheming or alignment issues. A guardrails system can be set up for real-time monitoring of their reasoning “thought traces” from the models that share their thought traces. Thought traces may come from systems like Chain-of-Thoughts (CoT), Tree-of-Thoughts (ToT), Algorithm-of-Thoughts (AoT) or ReAct (thought-action-reasoning cycles). Guardrails rules can be configured to check for “deception”, “evasion” or “subversion” in the thought traces.
However, not all commercial systems will share their “thought traces” which are like a “debug mode” for LLMs. This includes OpenAI’s o1, o3 or DeepSeek’s R1 models. Guardrails systems can provide a “goal consistency analysis”, between the goals given to the system and the behavior of the system. Cautious users may consider not using these commercial frontier LLM systems, and make use of open-source Llama or a system with their own reasoning implementation, to provide all thought traces.
Architectural solutions can include sandboxing, to prevent or control models from executing operating system commands to alter files, send network requests, and modify their environment. Tight controls to prevent models from copying their model weights would be appropriate as well. Running multiple instances of the same model on the same prompt to detect behavior variations helps. The running redundant instances can be limited to the most crucial decisions, as an additional check. Preventing self-modifying code, ... (see link for full description)
GenAI for Quant Analytics: survey-analytics.ai (Inspirient)
Pitched at the Greenbook Insight Innovation Competition as part of IIEX North America 2025 on 30 April 2025 in Washington, D.C.
Join us at survey-analytics.ai!