This document discusses various bioinformatics tools and methods for identifying genes from genomic sequences. It begins by defining genes and genomes, then describes reference databases like RefSeq that are important for gene identification. It outlines the general workflow for gene identification, including obtaining sequences, preprocessing, annotation, prediction, and validation. Specific tools mentioned include GENSCAN, Glimmer, and Augustus for gene prediction, and BLAST for sequence alignment. The document also discusses identifying other genomic features like promoters, repeats, and open reading frames. It emphasizes that accurate gene identification requires both computational and experimental approaches.
The Needleman-Wunsch algorithm finds the optimal global alignment of two nucleotide or protein sequences. It works by filling a matrix using a recursive formula that considers the best score from adjacent cells, incorporating substitution scores and gap penalties. For two sequences of length n, the algorithm runs in quadratic time (on the order of n^2 steps), whereas assessing every possible alignment individually takes exponential time (on the order of 2^n steps).
The Smith-Waterman algorithm finds the best local alignment between two sequences. It involves filling a matrix using a recurrence relation to score matches, mismatches, and gaps. The highest scoring cell represents the best local alignment, which can be traced back through the matrix. For example, the best local alignment between sequences "TCAGTTGCC" and "AGGTTG" is "GTTG" with a score of 4.
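The matrix fill and traceback described above can be sketched in a few lines of Python. This is a minimal illustration, not the summarized document's own code: the function name `smith_waterman` and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen so that the quoted example yields a score of 4.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not taken from the source.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option lets an alignment restart anywhere (local alignment)
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TCAGTTGCC", "AGGTTG"))
```

Under these assumed scores the call returns a score of 4 with "GTTG" aligned to "GTTG". A different scoring scheme can change both the score and the reported region.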
The Needleman-Wunsch algorithm is used to perform global sequence alignment to identify similarities between nucleotide or protein sequences. It is an example of dynamic programming that finds the optimal alignment between two sequences in quadratic time. The algorithm initializes a scoring matrix and then fills it using recursion relations, scoring matches, mismatches, and gaps. It then traces back through the matrix to find the highest scoring alignment path and deduce the optimal alignment between the sequences.
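The three stages named above (initialize, fill, trace back) might look as follows in Python. This is a sketch under stated assumptions: the function name `needleman_wunsch` and the scores (match +1, mismatch -1, linear gap penalty -1) are illustrative and not taken from the summarized document.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]
    # Initialization: aligning a prefix against nothing costs one gap per letter
    for i in range(1, rows):
        F[i][0] = i * gap
    for j in range(1, cols):
        F[0][j] = j * gap
    # Fill: each cell takes the best of diagonal (substitution), up, or left (gap)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Traceback: walk from the bottom-right corner back to the origin
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return F[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

The two nested loops over the matrix are where the quadratic running time comes from. When several alignments tie for the best score, this sketch reports only one of them, preferring the diagonal move.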
Scoring schemes in bioinformatics (BLOSUM), by SumatiHajela
This document discusses scoring schemes in bioinformatics, specifically BLOSUM (BLOcks SUbstitution Matrix). It introduces BLOSUM, describing that it is based on conserved amino acid patterns from multiple sequence alignments. It then explains the BLOSUM-62 matrix and the BLOSUM scoring algorithm. The document contrasts BLOSUM with PAM matrices, noting key differences like BLOSUM being based on direct observations while PAM uses evolutionary modeling. Finally, it outlines the significance of scoring matrices for detecting distant evolutionary relationships between protein sequences.
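As a rough illustration of how a substitution score can be derived from direct observations, the sketch below computes a log-odds score in half-bit units, the scaling commonly used for BLOSUM-style matrices. The function name `log_odds_score` and the frequencies passed to it are hypothetical; they are not values from any real BLOSUM table.

```python
import math


def log_odds_score(observed_freq, expected_freq, scale=2.0):
    """Rounded log-odds score: scale * log2(observed / expected).

    observed_freq: how often a residue pair is seen in aligned blocks.
    expected_freq: how often the pair would occur by chance.
    Both inputs are hypothetical values in this sketch.
    """
    return round(scale * math.log2(observed_freq / expected_freq))


# A pair seen four times more often than chance gets a positive score
print(log_odds_score(0.02, 0.005))
# A pair seen four times less often than chance gets a negative score
print(log_odds_score(0.005, 0.02))
```

Positive scores mark substitutions that occur more often than expected by chance in conserved blocks, and negative scores mark substitutions that are under-represented.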
Dr Avril Coghlan discusses the BLAST algorithm for comparing biological sequences and searching databases of DNA and protein sequences. BLAST is a fast heuristic method for sequence alignment and database searching. It works by first finding short words that are common between the query sequence and database sequences, and then extending the alignment around these words. BLAST is able to quickly search very large databases and find significant matches by calculating E-values, which estimate the statistical significance of matches. BLAST allows researchers to determine if a new sequence is similar to any known sequences and predict potential functions.
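The seed-and-extend idea can be sketched as below. This is a deliberately simplified illustration, not BLAST itself: it uses exact word matches and extends only while letters are identical, whereas BLAST scores neighbouring words, uses a drop-off threshold during extension, and computes E-values. The names `find_seeds` and `extend_seed` and the word size `k=3` are assumptions.

```python
from collections import defaultdict


def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where an exact k-letter word is shared."""
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)
    seeds = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, i, j, k=3):
    """Extend a seed in both directions while letters are identical (no gaps)."""
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = i + k, j + k
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    return query[start_q:end_q], start_q, start_s


query, subject = "AGGTTG", "TCAGTTGCC"
for i, j in find_seeds(query, subject):
    print((i, j), extend_seed(query, subject, i, j))
```

Indexing the subject words once is what makes the search fast: only positions that share a word with the query are ever examined, rather than every cell of a full alignment matrix.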
The Smith-Waterman (S-W) algorithm performs local sequence alignment, identifying similar regions between two nucleotide or protein sequences.
Instead of aligning the sequences over their entire length, the S-W algorithm compares segments of all possible lengths and optimizes the similarity measure.
This document summarizes key aspects of sequence alignment. It discusses how sequence alignment involves comparing sequences to find identical or similar characters in the same order. It describes global and local alignment and the algorithms used for each. It also discusses scoring systems for alignments, including penalties for gaps and mismatches. The goals of sequence alignment are to infer functional, structural or evolutionary relationships between sequences.
Protein-protein interactions are important for many biological processes. There are various types of interactions depending on their composition and duration. Methods to study interactions include yeast two-hybrid, co-immunoprecipitation, affinity chromatography, and chromatin immunoprecipitation. Databases such as IntAct and MINT provide repositories for protein interaction data.
Protein-DNA interactions can be either specific or non-specific. Specific interactions involve transcription factors that regulate gene expression by binding to DNA motifs through domains like helix-loop-helix, leucine zipper, or zinc finger motifs. Non-specific interactions involve histones that help structure DNA into nucleosomes within chromatin and can be chemically modified through methylation, demethylation, acetylation, and phosphorylation.
This document discusses Biopython, a Python package for biological data analysis. It provides concise summaries of key Biopython concepts:
1) Biopython is an object-oriented Python package that consists of modules for common biological data operations like working with sequences.
2) Key Biopython classes include Alphabet for sequence alphabets, Seq for representing sequences, SeqRecord for sequences with metadata, and SeqIO for reading/writing sequences to files.
3) Classes specify attributes (data) and methods (functions) that objects can have. For example, Seq objects have attributes like sequence and alphabet, and methods like translate() and complement().
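To make the distinction between attributes and methods concrete, here is a toy class written in plain Python. It is a hypothetical stand-in named `MiniSeq`, not Biopython's `Seq` class, and it handles only unambiguous DNA letters.

```python
class MiniSeq:
    """Hypothetical toy sequence class (not Biopython's Seq)."""

    # Class-level lookup table used by the methods below
    _COMPLEMENT = str.maketrans("ACGT", "TGCA")

    def __init__(self, sequence):
        # Attribute: the data each object carries
        self.sequence = sequence.upper()

    def complement(self):
        # Method: a function that operates on the object's data
        return MiniSeq(self.sequence.translate(self._COMPLEMENT))

    def reverse_complement(self):
        return MiniSeq(self.sequence.translate(self._COMPLEMENT)[::-1])

    def __str__(self):
        return self.sequence


dna = MiniSeq("atgc")
print(dna.sequence)              # attribute access
print(dna.complement())          # method call
print(dna.reverse_complement())
```

Each method returns a new object rather than changing the existing one, which is a design choice of this sketch that keeps sequences safe to share between parts of a program.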
The document discusses protein-protein interactions (PPIs) and methods used to study them. It defines PPIs as physical contacts between two or more proteins through biochemical or electrostatic forces. It describes different types of PPIs including homo-oligomers, hetero-oligomers, covalent and non-covalent interactions. Common methods to study PPIs are also summarized, such as yeast two-hybrid systems, co-immunoprecipitation, and protein interaction databases. The applications and importance of PPI research are mentioned including roles in various cellular processes and diseases.
Once a genome has been sequenced, the first question that comes to mind is "Where are the genes?". Genome annotation is the process of attaching information to biological sequences. It is an active area of research, and knowing the coding parts of a genome greatly helps scientists proceed with their wet lab projects.
The document describes several key databases within the KEGG resource, including:
- The PATHWAY database containing molecular network maps of metabolic and genetic pathways.
- The BRITE database providing hierarchical classifications of biological systems beyond what is shown in pathways.
- The LIGAND database consisting of chemical compounds, carbohydrates, reactions, and enzyme information.
KEGG aims to comprehensively capture biological knowledge through integrated databases covering genomes, pathways, diseases and drugs.
Dot plots are a graphical method for assessing similarity between two sequences. A dot plot is created by making a matrix of one sequence against the other and coloring in cells with identical letters. Regions of local similarity appear as diagonal lines of colored dots. The document discusses how to create dot plots between DNA and protein sequences and explains how using a sliding window threshold can filter out random matches. Pros and cons of dot plots are provided along with examples of software that can be used to generate dot plots.
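A dot plot with an optional sliding window can be sketched in plain Python as follows. The function names `dot_plot` and `render`, and the parameters `window` and `threshold`, are illustrative assumptions rather than the interface of any particular dot plot program.

```python
def dot_plot(seq_x, seq_y, window=1, threshold=1):
    """Return a matrix of booleans with one row per position in seq_y.

    A cell is True when the two windows starting at that pair of positions
    share at least `threshold` identical letters. With window=1 and
    threshold=1 this is the basic plot of identical letters.
    """
    rows = len(seq_y) - window + 1
    cols = len(seq_x) - window + 1
    return [
        [
            sum(seq_y[i + k] == seq_x[j + k] for k in range(window)) >= threshold
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a dot, '.' for an empty cell."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in matrix)


print(render(dot_plot("TCAGTTGCC", "AGGTTG")))
print()
# Requiring 3 identical letters in a window of 3 removes isolated random matches
print(render(dot_plot("TCAGTTGCC", "AGGTTG", window=3, threshold=3)))
```

Raising the window size and threshold trades sensitivity for clarity: short chance matches disappear, and only longer diagonal runs remain.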
TrEMBL is a computer-annotated protein sequence database created by Rolf Apweiler that contains translations of coding sequences from nucleotide databases like EMBL and GenBank as well as protein sequences from literature or submitted directly. The database provides automated classification and annotation to enrich the protein sequences.
This is based on physical methods for studying protein-ligand interactions, which give us knowledge of how the body's proteins interact with other molecules and of protein function.
Catalytic antibodies (abzymes) are monoclonal antibodies that exhibit enzymatic activity. They are produced by immunizing animals with transition state analogs that mimic the intermediate of chemical reactions. Abzymes function like enzymes by binding and stabilizing the transition state, lowering the activation energy of reactions and catalyzing them. Potential applications of abzymes include treating cancer, HIV, drug detoxification, controlling obesity, and targeting unwanted protein-protein interactions. One example is an abzyme that catalytically destroys the CD4 binding site on HIV, rendering the virus inert.
In this presentation I have explained all the supersecondary structures, their types and their functions. The presentation is designed to clarify the basic concepts first and then move on to more advanced material. I hope you like it.
The European Molecular Biology Laboratory (EMBL) is a molecular biology research institution supported by 22 member states. EMBL was created in 1974 and operates from five sites, performing basic research in molecular biology and molecular medicine. A key function of EMBL is the EMBL Nucleotide Sequence Database, maintained at the European Bioinformatics Institute, which incorporates and distributes nucleotide sequences from public sources as part of an international collaboration.
Data mining involves using machine learning and statistical methods to discover patterns in large datasets and is useful in bioinformatics for analyzing biological data. Bioinformatics analyzes data from sequences, molecules, gene expressions, and pathways. Data mining can help understand these rapidly growing biological datasets. Common data mining tools in bioinformatics include BLAST for sequence comparisons, Entrez for integrated database searching, and ORF Finder for identifying open reading frames. Data mining approaches are well-suited to the enormous volumes of data in bioinformatics databases.
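As an illustration of what identifying open reading frames involves, the sketch below scans the three forward reading frames of a DNA string for a start codon followed by an in-frame stop codon. It is a simplified example, not the algorithm used by the ORF Finder tool: it ignores the reverse strand and alternative genetic codes, and the name `find_orfs` and the `min_codons` parameter are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for ORFs on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                start = None
    return orfs


print(find_orfs("CCATGAAATAAGG"))
```

Coordinates are zero-based with an exclusive end, so `dna[start:end]` reproduces the reported sequence including its stop codon.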
Dynamic programming is used for sequence alignment and other bioinformatics tasks. It works by breaking problems down into smaller subproblems. Needleman and Wunsch introduced an algorithm for global sequence alignment using dynamic programming that maximizes matches between sequences. The algorithm involves initializing a matrix, filling it using a scoring scheme, and backtracking to trace the alignment. Local alignment follows a similar approach but replaces negative values in the matrix with zeros, which restricts the alignment to high-scoring local regions.
1) Pairwise sequence alignment is a method to compare two biological sequences like DNA, RNA, or proteins. It involves arranging the sequences in columns to highlight their similarities and differences.
2) There are many possible alignments between two sequences, but most imply too many mutations. The best alignment minimizes the number of mutations needed to explain the differences between the sequences.
3) For short protein sequences like "QKGSYPVRSTC" and "QKGSGPVRSTC", the optimal alignment implies one single mutation occurred since the sequences diverged from a common ancestor.
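For two sequences of equal length that align without gaps, the number of implied mutations is simply the number of mismatched columns. The sketch below checks this for the pair quoted above; the function name `mismatches` is an illustrative assumption.

```python
def mismatches(a, b):
    """Return (position, letter_in_a, letter_in_b) for each differing column.

    Assumes an ungapped alignment of two equal-length sequences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = mismatches("QKGSYPVRSTC", "QKGSGPVRSTC")
print(len(diffs), diffs)
```

The single difference is at zero-based position 4, where Y in the first sequence is paired with G in the second, consistent with one mutation.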
The document discusses various supersecondary structures of proteins, which are intermediate structures between secondary and tertiary protein structures. It describes several common motifs composed of two or more secondary structures, such as helix-turn-helix, helix-loop-helix, beta-hairpins, and the Rossmann fold. These motifs are building blocks that occur frequently in protein structures and are associated with specific functions like DNA binding. The document provides detailed examples and diagrams of different supersecondary structure motifs involving helices, strands, and their combinations.
This presentation gives detailed information about the Swiss-Prot database, which is part of UniProtKB. It also covers TrEMBL, a computer-annotated supplement to Swiss-Prot.
This document discusses bacteriophages and their use in phage display. Specifically, it notes that bacteriophages infect bacterial cells and use them to replicate viruses. It then explains that phage display involves fusing foreign genes or proteins to the surface of phages, creating libraries of phages that each display a single protein. These libraries can be exposed to targets, and phages that interact are selected and amplified through multiple rounds. The document outlines several applications of proteins isolated through phage display, such as epitope mapping, drug discovery, and developing new vaccines or treatments that have a specific interaction with a target antigen, protein, or disease.
This document provides an overview and instructions for a course in Python programming. It discusses the recommended course literature, including Learning Python and Python in a Nutshell books. It also describes using the IDLE integrated development environment for writing and running Python code on Windows and Unix systems. The document then begins covering basic Python concepts like variables, data types, strings, lists, dictionaries and objects.
The document provides an introduction to the Python programming language, outlining its key features such as being dynamically typed, object oriented, scalable, extensible, portable, and readable. It describes Python's syntax differences from C-style languages and covers basic Python concepts like variables, data types, operators, strings, comments, control flow, functions, modules and packages. The document is intended to help new Python programmers get an overview of the language.
Python is an interpreted, object-oriented programming language similar to PERL, that has gained popularity because of its clear syntax and readability.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.
This presentation is a draft summary of the book "Learn Python The Hard Way", which is very helpful for anyone who wants to learn Python from scratch.
To read the book and do the exercises, the book is available for free here: http://learnpythonthehardway.org/book/
This tutorial provides an introduction to the Python programming language. It will cover Python's core features like syntax, data types, operators, conditional and loop execution, functions, modules and packages to enable writing basic programs. The tutorial is intended for learners to learn Python together through questions, discussions and pointing out mistakes.
Python is a multi-paradigm programming language that is object-oriented, imperative and functional. It is dynamically typed, with support for complex data types like lists and strings. Python code is commonly written and executed using the interactive development environment IDLE.
Python is a multi-paradigm programming language that supports object-oriented, imperative and functional programming styles. It is dynamically typed and supports complex data types like lists, dictionaries and objects. Some key features of Python include being highly readable, having extensive libraries, and being cross-platform.
Python is a multi-paradigm programming language that is object-oriented, imperative and functional. It is an interpreted language with dynamic typing, automatic memory management and many useful features including a large standard library. Python code can be written and executed using the interactive IDE named IDLE.
This document provides an introduction and overview of the Python programming language. It discusses what Python is, why to learn a scripting language and why Python specifically. It covers how to install Python and how to edit Python code using IDLE. The rest of the document demonstrates various Python concepts like variables, data types, operators, flow control statements, functions and more through sample code examples. Each code example is accompanied by instructions to run the sample script in IDLE.
This document provides an agenda and overview for a Python training course. The agenda covers key Python topics like dictionaries, conditional statements, loops, functions, modules, input/output, error handling, object-oriented programming and more. The introduction section explains that Python is an interpreted, interactive and object-oriented language well-suited for beginners. It also outlines features like rapid development, automatic memory management and support for procedural and object-oriented programming. The document concludes by explaining Python's core data types including numbers, strings, lists, tuples and dictionaries.
This document provides an introduction to Python including:
- The major versions of Python and their differences
- Popular integrated development environments for Python
- How to set up Python environments using Anaconda and Eclipse
- An overview of Python basics like variables, numbers, strings, lists, dictionaries, modules and functions
- Examples of Python control flow structures like conditionals and loops
The document provides an introduction to Python programming, including details about Python's history, versions, data types, strings, and code execution. It discusses how to install Python and write basic programs. Key reasons for using Python are its object-oriented nature, readability, large standard library, cross-platform capabilities, and ease of use. The document also covers string methods and slicing, numeric data types, installing Python, and running code in interactive and script modes.
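A few of the string and numeric features mentioned above can be tried directly in interactive mode or saved as a script. The variable names and sample values below are arbitrary illustrations.

```python
name = "biopython"

# String methods return new strings; the original is unchanged
upper_name = name.upper()

# Slicing: [start:stop:step], with negative indexes counting from the end
prefix = name[:3]
suffix = name[-6:]
reversed_name = name[::-1]

# split() breaks a string into a list of words
words = "sequence alignment tools".split()

# Numeric data types: int, float and complex
whole, fraction, z = 42, 0.5, 2 + 3j

print(upper_name, prefix, suffix, reversed_name)
print(words, len(words))
print(type(whole).__name__, type(fraction).__name__, type(z).__name__)
```

In interactive mode each expression can be typed on its own to see its value immediately; in script mode the `print` calls are needed to display results.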
This document provides an introduction to the Python programming language. It discusses what Python is, its history and creator, why it is popular, who uses it, and how to get started with the syntax. Key topics covered include Python's readability, dynamic typing, standard library, and use across many industries. The document also includes code examples demonstrating basic Python concepts like variables, strings, control flow, functions, and file input/output.
- Python is an interpreted, object-oriented programming language that is beginner friendly and open source. It was created in the 1990s and named after Monty Python.
- Python is very suitable for natural language processing tasks due to its built-in string and list datatypes as well as libraries like NLTK. It also has strong numeric processing capabilities useful for machine learning.
- Python code is organized using functions, classes, modules, and packages to improve structure. It is interpreted at runtime rather than requiring a separate compilation step.
Biological application of spectroscopy.pptx, by RahulRajai
Spectroscopy in biological studies involves using light or other forms of electromagnetic radiation to analyze the structure, function, and interactions of biological molecules. It helps researchers understand how molecules like proteins, nucleic acids, and lipids behave and interact within cells.
Biological applications of spectroscopy:
1. Studying Biological Molecules:
Proteins: Spectroscopy can reveal protein structure, including folding patterns and interactions with other molecules.
Nucleic Acids: It helps analyze the structure of DNA and RNA, including their base sequences and interactions.
Lipids: Spectroscopy can be used to study lipid interactions within cell membranes and their role in cellular processes.
Metabolic Pathways: Spectroscopy can monitor changes in metabolic processes and cellular signaling pathways, providing insights into how cells function.
A review on simple heterocyclics involved in chemical, biochemical and metabo... by DrAparnaYeddala
Heterocyclics play a crucial role in the drug discovery process and exhibit various biological activities. Among aromatic heterocycles, the prevalent moieties are five-membered rings. The role and utility of heterocycles in organic synthesis paved the way for developing precursors for amino acids, medicinal drugs and other chemical components. The potency of an organic molecule is measured by its non-toxic nature, low dosage and inhibition of microbial cell wall growth.
These properties are also used to evaluate their potential as drugs, pharmaceuticals, special chemicals and agrochemicals.
Heterocyclic chemistry accounts for nearly thirty percent of contemporary publications. In fact, seventy-five percent of organic compounds are heterocyclic compounds. Nitrogen-containing alkaloids such as ergotamine show antimigraine activity, and cinchonine displays antimalarial activity. The activity of these compounds has been explored by many researchers in medicinal, insecticidal and pesticidal applications and in naturally occurring amino acids. Nucleic acid strands contain heterocyclic compounds as major components. They also play a major role as central nervous system activators, as insecticides and pesticides, and in physiological processes such as anti-inflammatory and antitumor activity.
Compound Microscope with working principleRahulRajai
A compound microscope is a type of optical microscope that uses two or more lenses to magnify a specimen. It achieves this by first magnifying the image using the objective lens, and then further magnifying that image using the eyepiece lens. This two-step magnification process allows for detailed observation of small objects,
Analytical techniques in dry chemistry for heavy metal analysis and recent ad...Archana Verma
Heavy Metals is often used as a group name for metals and semimetals (metalloids) that have been associated with contamination and potential toxicity (Duffus, 2001). Heavy metals inhibit various enzymes and compete with various essential cations (Tchounwou et al., 2012). These may cause toxic effects (some of them at a very low content level) if they occur excessively, because of this the assessment to know their extent of contamination in soil becomes very important. Analytical techniques of dry chemistry are non-destructive and rapid and due to that a huge amount of soil samples can be analysed to know extent of heavy metal pollution, which conventional way of analysis not provide efficiently because of being tedious processes. Compared with conventional analytical methods, Vis-NIR techniques provide spectrally rich and spatially continuous information to obtain soil physical and chemical contamination. Among the calibration methods, a number of multivariate regression techniques for assessing heavy metal contamination have been employed by many studies effectively (Costa et al.,2020). X-ray fluorescence spectrometry has several advantages when compared to other multi-elemental techniques such as inductively coupled plasma mass spectrometry (ICP-MS). The main advantages of XRF analysis are; the limited preparation required for solid samples and the decreased production of hazardous waste. Field portable (FP)-XRF retains these advantages while additionally providing data on-site and hence reducing costs associated with sample transport and storage (Pearson et al.,2013). Laser Induced Breakdown Spectroscopy (LIBS) is a kind of atomic emission spectroscopy. In LIBS technology, a laser pulse is focused precisely onto the surface of a target sample, ablating a certain amount of sample to create plasma (Vincenzo Palleschi,2020). After obtaining the LIBS data of the tested sample, qualitative and quantitative analysis is conducted. 
Even after being rapid and non-destructive, several limitations are also there in these advance techniques such as more effective and accurate quantification models are needed. To overcome these problems, proper calibration models should be developed for better quantification of spectrum in near future.
This presentation provides a concise overview of the human immune system's fundamental response to viral infections. It covers both innate and adaptive immune mechanisms, detailing the roles of physical barriers, interferons, natural killer (NK) cells, antigen-presenting cells (APCs), B cells, and T cells in combating viruses. Designed for students, educators, and anyone interested in immunology, this slide deck simplifies complex biological processes and highlights key steps in viral detection, immune activation, and memory formation. Ideal for classroom use or self-learning.
Towards Scientific Foundation Models (Invited Talk)Steffen Staab
Foundation models are machine-learned models that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. Foundation models have been used successfully for question answering and text generation (ChatGPT), image understanding (Clip, VIT), or image generation. Recently, the basic idea underlying foundation models been considered for learning scientific foundation models that capture expectations about partial differential equations. Existing scientific foundation models have still been very much limited wrt. the type of PDEs or differential operators . In this talk, I present some of our recent work on paving the way towards scientific foundation models that aims at making them more robust and better generalisable.
2. Scripting languages
• Scripting languages are programming
languages that are interpreted
instead of compiled.
• They are generally considered high-level and
are usually easier to read and learn.
• Examples:
• Bash (shell scripting)
• R (statistical scripting)
• Perl (general-purpose scripting)
• Python (general-purpose scripting)
3. • A popular, open-source, multi-platform,
general-purpose scripting language.
• Many extensions and libraries for scientific
computing.
• Supported versions when these slides were written: 2.7 and 3.5
(Python 2.7 reached end of life in January 2020).
Install Python on your computer!
• Official Python distribution:
https://www.python.org/downloads/
• Jupyter (formerly IPython Notebook):
https://www.continuum.io/downloads
4. Learning Goals
1. Understand strings to print and manipulate text
2. Use the open() function to read and write files
3. Understand lists and use loops to go through them
4. Create your own functions
5. Use conditional tests to add more functionality to
scripts
5. Leaky pipes - A formatting problem
Blergh… All my files are messed up!
They are in the wrong format!
The program I want to use won’t open them!
⎯ Frustrated bioinformatician
• We often require code to parse the output of
one program and produce another file as input
for a specific software.
Parse:
To analyze a text to extract useful information from it.
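A minimal sketch of the kind of parsing described above, written for Python 3. The input layout (one "name<TAB>sequence" pair per line) and the function name tsv_to_fasta are assumptions made up for illustration; they are not taken from any particular program.

```python
def tsv_to_fasta(tsv_text):
    """Turn 'name<TAB>sequence' lines into FASTA-formatted text."""
    records = []
    for line in tsv_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, seq = line.split("\t")
        records.append(">" + name + "\n" + seq)
    return "\n".join(records) + "\n"

# Hypothetical output of "program one"
program_output = "seq1\tATGTGA\nseq2\tATGTAA\n"
fasta_text = tsv_to_fasta(program_output)
print(fasta_text)
```

The rest of the slides build up the pieces used here: strings, loops, functions and conditions.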
13. Handling text in Python
Printing text to the terminal:
>>> print("Hello world")
Hello world
• Python interpreter prompt: >>>
• Input: print("Hello world")
• Function: print()
• Argument: "Hello world"
• Output: Hello world
14. Handling text in Python
What happens if we use single quotes?
>>> print('Hello world')
Hello world
We get the same result!!!
• In Python single quotes '' and double
quotes "" are interchangeable.
But, don’t mix them!
15. Handling text in Python
What happens if we mix quotes?
>>> print('Hello world")
  File "<stdin>", line 1
    print('Hello world")
                       ^
SyntaxError: EOL while scanning single-quoted string
Whoops!
19. Handling text in Python
Error messages give us important clues:
>>> print('Hello world")
  File "<stdin>", line 1
    print('Hello world")
                       ^
SyntaxError: EOL while scanning single-quoted string
• File and line containing error.
• Best guess as to where error is found.
• Error type and explanation.
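The same three clues can be read from the error object itself. This is a small Python 3 sketch that uses the built-in compile() function on a deliberately broken line; the variable names are illustrative, and the exact wording of the message differs between Python versions.

```python
# Deliberately broken source: the string opens with ' and closes with "
bad_source = "print('Hello world\")"

try:
    compile(bad_source, "<stdin>", "exec")
except SyntaxError as err:
    error_file = err.filename   # file containing the error
    error_line = err.lineno     # line containing the error
    error_column = err.offset   # best guess as to where the error is
    error_message = err.msg     # explanation (wording varies by version)
    print(error_file, error_line, error_column, error_message)
```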
23. Handling text in Python
We can save strings as variables:
>>> #My first variable!
>>> dna_seq1 = "ATGTGA"
>>> dna_seq1 = "ATGTAA"
• A line starting with # is a comment.
• We use the = symbol to assign a variable.
• We can re-assign variables as many times
as we want.
That’s why they’re called variables!
24. Handling text in Python
We can save strings as variables:
>>> print(dna_seq1)
ATGTAA
• Once assigned, we can use the
variable name instead of its content.
• Variable names can have letters, numbers,
and underscores.
• They can’t start with numbers.
• They are case-sensitive.
Name your variables carefully!
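One way to check the naming rules above without trial and error, sketched for Python 3. The helper name is_valid_name is made up for this example; str.isidentifier() and keyword.iskeyword() are standard-library calls. Note that Python 3 also accepts non-ASCII letters in names.

```python
import keyword

def is_valid_name(name):
    """Return True if `name` can be used as a Python variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["dna_seq1", "1st_seq", "DNA_seq1", "dna-seq", "for"]
for name in candidates:
    print(name, is_valid_name(name))

# Names are case-sensitive: these are two different variables
dna_seq1 = "ATGTGA"
DNA_seq1 = "ATGTAA"
```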
25. Handling text in Python
Any value between quotes is called a string:
>>> type(dna_seq1)
<type 'str'>
• Strings ('str') are a type of object.
• Other types include integers ('int'),
floats ('float'), lists ('list'), etc…
• Strings are mainly used to manipulate text
within Python.
Understanding how to use strings is crucial
for bioinformatics!
26. String operations
Concatenation
>>> start_codon = 'ATG'
>>> stop_codon = 'TGA'
>>> coding_seq = 'CATATT'
>>> full_seq = start_codon + coding_seq \
... + stop_codon
>>> print(full_seq)
ATGCATATTTGA
• To combine strings, we use the + operator
27. String operations
String length
>>> len(full_seq)
12
>>> #The interpreter echoes the value; in a
>>> #script, store it in a variable instead
>>> full_length = len(full_seq)
>>> print(full_length)
12
>>> type(full_length)
<type 'int'>
• To find the length of a string we can use
the len() function.
• Its return value is an integer (number).
28. String operations
Turning objects into strings
>>> print("The length of our seq is "
... + full_length)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'int' objects
• It is not possible to concatenate objects of
different types.
29. String operations
Turning objects into strings
>>> print("The length of our seq is "
... + str(full_length))
The length of our seq is 12
• The str() function turns any object into a
string.
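For comparison, a short Python 3 sketch of the failing concatenation and three ways around it. The variable names follow the slides; the .format() and f-string variants are alternatives not covered in the deck, and f-strings need Python 3.6 or later.

```python
full_seq = "ATGCATATTTGA"
full_length = len(full_seq)

# Mixing str and int with + raises a TypeError
try:
    message = "The length of our seq is " + full_length
except TypeError as err:
    print("TypeError:", err)

# Option 1: explicit conversion with str()
message_str = "The length of our seq is " + str(full_length)
# Option 2: the .format() string method
message_fmt = "The length of our seq is {}".format(full_length)
# Option 3: an f-string (Python 3.6+)
message_f = f"The length of our seq is {full_length}"

print(message_str)
```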
30. String operations
Substrings
>>> #Let’s print only the coding sequence
>>> print(full_seq[3:9])
CATATT
• To understand how we did it we need to
know how strings are numbered:
A  T  G  C  A  T  A  T  T  T  G  A
0  1  2  3  4  5  6  7  8  9  10 11
Python always starts counting from zero!!!
34. String operations
Substrings
>>> #Let’s print only the coding sequence
>>> print(full_seq[3:9])
CATATT
• How to create a substring:
A  T  G |C  A  T  A  T  T |T  G  A
0  1  2 [3  4  5  6  7  8 ]9  10 11
The first number is included (start inclusive).
The second number is excluded (end exclusive).
35. String operations
Substrings
>>> #We can also print just one letter
>>> print(full_seq[11])
A
• Each character in the string can be called
using its position (index) number:
A  T  G  C  A  T  A  T  T  T  G  A
0  1  2  3  4  5  6  7  8  9  10 11
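A few more slices of the same sequence, as a Python 3 sketch. Negative indices and the codon-splitting line go beyond what the slides show and are included only to illustrate the start-inclusive, end-exclusive rule.

```python
full_seq = "ATGCATATTTGA"

start_codon = full_seq[:3]    # indices 0, 1, 2 (a missing start means 0)
coding_seq = full_seq[3:9]    # indices 3 to 8; index 9 is excluded
stop_codon = full_seq[-3:]    # negative indices count from the end
last_base = full_seq[-1]      # same character as full_seq[11]

# Step through the string three bases at a time to get the codons
codons = [full_seq[i:i + 3] for i in range(0, len(full_seq), 3)]

print(start_codon, coding_seq, stop_codon, last_base)
print(codons)
```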
36. String operations
Methods
>>> lower_seq = full_seq.lower()
>>> print(lower_seq)
atgcatatttga
• A method is similar to a function, but it is
associated with a specific object type.
• We call them after a variable of the right type,
using a '.' (period) to separate them.
• In this case, the method .lower() is called
on strings to convert all uppercase
characters into lowercase.
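Some other string methods that come up often with sequences, sketched for Python 3. All four are built-in str methods; replacing T with U is only a simplified stand-in for transcription.

```python
full_seq = "ATGCATATTTGA"

lower_seq = full_seq.lower()              # 'atgcatatttga'
a_count = full_seq.count("A")             # how many times "A" occurs
rna_seq = full_seq.replace("T", "U")      # simplified DNA -> RNA
has_start = full_seq.startswith("ATG")    # True or False

# Methods return new strings; the original is left unchanged
print(full_seq, lower_seq, a_count, rna_seq, has_start)
```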
38. Opening files
The open() function is used to open files:
>>> my_file = open("BV164695.1.seq","r")
>>> print(my_file)
<open file 'BV164695.1.seq', mode 'r' at 0x109de84b0>
• It returns a file object.
• This object is different from other types of
objects.
• We rarely interact with it directly.
• We mostly interact with it through
methods.
39. Opening files
The open() function is used to open files:
>>> my_file = open("BV164695.1.seq","r")
• The first argument is the path to the file.
• This path should be relative to our working
directory.*
• The second argument is the mode in which
we are opening the file.
• We separate arguments using a comma.
Don’t forget the quotes!
40. Opening files
Files can be opened in three basic modes:
• Read ( "r" ): Permits access to the content
of a file, but can’t modify it (default).
• Write ( "w" ): Enables the user to overwrite the
contents of a file (existing content is erased on opening).
• Append ( "a" ): Enables the user to add
content to a file, without erasing previous
content.
Don’t confuse write and append,
you could lose a lot of data!
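The difference between write and append can be checked safely on a throwaway file. This is a Python 3 sketch; the temporary directory and the line contents are made up for the example.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is touched
path = os.path.join(tempfile.mkdtemp(), "test_out.txt")

out_file = open(path, "w")      # creates the file
out_file.write("first line\n")
out_file.close()

out_file = open(path, "a")      # keeps what is already there
out_file.write("second line\n")
out_file.close()

in_file = open(path, "r")
after_append = in_file.read()
in_file.close()

out_file = open(path, "w")      # erases the previous content
out_file.write("third line\n")
out_file.close()

in_file = open(path, "r")
after_write = in_file.read()
in_file.close()

print(repr(after_append))
print(repr(after_write))
```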
41. Opening files
The .read() method extracts file content:
>>> my_file = open("BV164695.1.seq","r")
>>> file_content = my_file.read()
>>> print(type(my_file),
... type(file_content))
(<type 'file'>, <type 'str'>)
• Returns the full contents of a file as a string.
• Needs no arguments (an optional size limits how much is read).
Remember: The .read() method can
only be used on file objects in read mode!
42. Opening files
The .write() method writes content into file:
>>> out_file = open("test_out.txt","w")
>>> hello_world = "Hello world!"
>>> out_file.write(hello_world)
• Writes content into file objects in "w" or "a"
modes.
• Argument must be a string.
The .write() method can
only be used on file objects in write or append mode!
43. Closing files
The .close() method flushes a file:
>>> print(out_file)
<open file 'test_out.txt', mode 'w' at 0x103f53540>
>>> out_file.close()
>>> print(out_file)
<closed file 'test_out.txt', mode 'w' at 0x103f53540>
• Closing a file flushes it: pending changes are
saved and other programs can use the file.
It is always good practice to close files after using them!
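Python can also close files for us. The with statement below is not covered in the slides; it is shown as a common alternative, sketched for Python 3 on a temporary file whose path is made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test_out.txt")

# The file is closed automatically when the indented block ends
with open(path, "w") as out_file:
    out_file.write("Hello world!")
    closed_inside = out_file.closed

closed_after = out_file.closed

with open(path) as in_file:     # the mode defaults to "r"
    file_content = in_file.read()

print(closed_inside, closed_after, file_content)
```

The file is closed even if an error occurs inside the block, which is why many scripts prefer this form over calling .close() by hand.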
45. Using lists
A list is an object containing several elements:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> print(type(nucleic_ac))
<type 'list'>
• A list is created using brackets [ ].
• The elements are separated by commas.
• List elements can be of any object type.
46. Using lists
It is possible to mix object types within lists:
>>> number_one = ["one", 1, 1.0]
>>> numbers_123 = [["one", 1, 1.0],
... ["two", 2, 2.0],["three", 3, 3.0]]
We can even make lists of lists!
47. Using lists
Elements are called using their index:
>>> number_one = ["one", 1, 1.0]
>>> numbers_123 = [["one", 1, 1.0],
... ["two", 2, 2.0],["three", 3, 3.0]]
>>> print(number_one[1],
... type(number_one[1]))
(1, <type 'int'>)
Don’t forget to start counting from zero!
48. Using lists
Elements are called using their index:
>>> number_one = ["one", 1, 1.0]
>>> numbers_123 = [["one", 1, 1.0],
... ["two", 2, 2.0],["three", 3, 3.0]]
>>> print(number_one[2],
... type(number_one[2]))
(1.0, <type 'float'>)
49. Using lists
Elements are called using their index:
>>> number_one = ["one", 1, 1.0]
>>> numbers_123 = [["one", 1, 1.0],
... ["two", 2, 2.0],["three", 3, 3.0]]
>>> print(numbers_123[0],
... type(numbers_123[0]))
(['one', 1, 1.0], <type 'list'>)
50. Using lists
Elements can be substituted using their index:
>>> numbers_123 = [["one", 1, 1.0],
... ["two", 2, 2.0],["three", 3, 3.0]]
>>> numbers_123[0] = ["zero", 0, 0.0]
>>> print(numbers_123)
[['zero', 0, 0.0], ['two', 2, 2.0],
['three', 3, 3.0]]
51. Using lists
The .append() method adds elements to lists:
>>> number_one = ["one", 1, 1.0]
>>> number_one.append("I")
>>> print(number_one)
['one', 1, 1.0, 'I']
• Takes only one argument.
• Doesn’t return anything, it modifies the
actual list.
• It only adds an element to the end of a list.
52. Using lists
Sublists can also be created using indices:
>>> number_one = ["one", 1, 1.0,"I"]
>>> number_1 = number_one[1:3]
>>> print(number_1, type(number_1))
([1, 1.0], <type 'list'>)
• Works like string slicing (first inclusive,
last exclusive).
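The list operations from the last few slides in one place, as a Python 3 sketch. The chained index numbers_123[1][0] and the negative index are small additions not shown in the deck.

```python
number_one = ["one", 1, 1.0]
numbers_123 = [["one", 1, 1.0], ["two", 2, 2.0], ["three", 3, 3.0]]

# Index twice to reach an element of an inner list
inner_value = numbers_123[1][0]

# .append() changes the list in place and returns None
result = number_one.append("I")

number_1 = number_one[1:3]    # sublist: first inclusive, last exclusive
last_item = number_one[-1]    # negative indices count from the end

print(inner_value, result, number_one, number_1, last_item)
```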
53. Using loops
Loops make it easier to act on list elements:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for string in nucleic_ac:
...     print(string + " is a nucleic acid")
...
DNA is a nucleic acid
mRNA is a nucleic acid
tRNA is a nucleic acid
54. Using loops
Loops have the following structure:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for string in nucleic_ac:
...     print(string + " is a nucleic acid")
...
DNA is a nucleic acid
mRNA is a nucleic acid
tRNA is a nucleic acid
• Loop statement:
for ____ in ____ :
Don’t forget the colon!
55. Using loops
Loops have the following structure:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for string in nucleic_ac:
...     print(string + " is a nucleic acid")
...
DNA is a nucleic acid
mRNA is a nucleic acid
tRNA is a nucleic acid
• Element name
• Same rules as variable naming.
This variable takes a new value on every pass
(and keeps the last one after the loop ends)!
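A quick Python 3 check of what happens to the element name once the loop is over: in Python the name stays bound to the last element rather than disappearing. The variable last_seen is introduced only for this example.

```python
nucleic_ac = ["DNA", "mRNA", "tRNA"]

for acid in nucleic_ac:
    print(acid + " is a nucleic acid")

# The loop has finished, but the name is still bound to the last element
last_seen = acid
print(last_seen)
```

This is one more reason to pick a name that will not clash with a variable used elsewhere in the script.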
56. Using loops
Loops have the following structure:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for acid in nucleic_ac:
...     print(acid + " is a nucleic acid")
...
DNA is a nucleic acid
mRNA is a nucleic acid
tRNA is a nucleic acid
• Element name
• Same rules as variable naming.
Choose appropriate names to avoid confusion.
57. Using loops
Loops have the following structure:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for acid in nucleic_ac:
...     print(acid + " is a nucleic acid")
...
DNA is a nucleic acid
mRNA is a nucleic acid
tRNA is a nucleic acid
• Iterable object
• The loop elements will depend on the
type of object.
58. Using loops
Some basic iterable object types:
Object type              Iterable element
List                     List element
String                   Individual characters
Open file in 'r' mode    Individual line in the file
Dictionary               Keys (insertion order since Python 3.7)
Set                      Set element (in arbitrary order)
The variety of iterable objects makes loops a
very powerful tool in Python!
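The table can be verified with one small helper, sketched here for Python 3. The function collect and the sample data are made up for illustration, and io.StringIO stands in for a real file opened in 'r' mode.

```python
import io

def collect(iterable):
    """Return every element a for loop would visit, in order."""
    elements = []
    for element in iterable:
        elements.append(element)
    return elements

codon_table = {"ATG": "Met", "TGA": "Stop"}
fake_file = io.StringIO(">seq1\nATGTGA\n")   # stands in for an open file

from_list = collect(["DNA", "mRNA"])
from_string = collect("ATG")
from_file = collect(fake_file)                # lines keep their "\n"
from_dict = collect(codon_table)              # keys, not values
from_values = collect(codon_table.values())   # ask for values explicitly
from_set = sorted(collect(set("ATGA")))       # sorted: set order is arbitrary

print(from_list, from_string, from_file, from_dict, from_values, from_set)
```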
59. Using loops
Loops have the following structure:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for acid in nucleic_ac:
...     print(acid + " is a nucleic acid")
...
DNA is a nucleic acid
mRNA is a nucleic acid
tRNA is a nucleic acid
• The body of the loop is defined by indentation
(a tab or, more commonly, four spaces).
• It can be as long as necessary, but all lines
must be indented the same way.
60. Using loops
Loops have the following structure:
>>> nucleic_ac = ["DNA","mRNA","tRNA"]
>>> for acid in nucleic_ac:
...     print(acid + " is a nucleic acid")
...     print("I like " + acid)
...
DNA is a nucleic acid
I like DNA
mRNA is a nucleic acid
I like mRNA
tRNA is a nucleic acid
I like tRNA
64. Creating functions
>>> def gc_content(seq):
... length = len(seq)
... G_content = seq.count(“G”)
... C_content = seq.count(“C”)
... GC_content =(G_content + C_content)
... / float(length)
... return GC_content
...
• The function name
• Same naming rules as variables
Function definitions have this structure:
65. Creating functions
Function definitions have this structure:
>>> def gc_content(seq):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return GC_content
...
• The argument(s) of our function
• Same naming rules as variables
• This part is optional
66. Creating functions
Function definitions have this structure:
>>> def gc_content(seq):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return GC_content
...
• The body of the function is defined by indentation
• It can be as long as necessary, but all lines
must be indented the same way.
67. Creating functions
Function definitions have this structure:
>>> def gc_content(seq):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return GC_content
...
• The return statement (optional)
• Can return one or more objects
• Ends the function as soon as it runs
68. Calling functions
Once defined, we can call a function:
>>> test_seq = "ACTGATCGATCG"
>>> gc_test = gc_content(test_seq)
>>> print(gc_test, type(gc_test))
(0.5, <type 'float'>)
>>> print(GC_content)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'GC_content' is not defined
Variables within the function are not defined outside
of that function!
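The same scoping rule, checked in a script rather than at the prompt. This is a Python 3 sketch; the local name gc_count and the flag leaked are made up for the example.

```python
def gc_content(seq):
    # gc_count is local: it only exists while the function runs
    gc_count = seq.count("G") + seq.count("C")
    return gc_count / len(seq)

gc_test = gc_content("ACTGATCGATCG")

# Outside the function, the local name is unknown
try:
    gc_count
    leaked = True
except NameError:
    leaked = False

print(gc_test, leaked)
```

Only the returned value crosses the function boundary, so anything we need later must be returned.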
69. Other function options
Let’s improve our function:
>>> test_seq = "ACTGATCGATCG"
>>> print(gc_content(test_seq))
0.5
>>> test_seq = "ACTGATCGATCGC"
>>> print(gc_content(test_seq))
0.538461538462
I don’t want that many numbers!
70. Other function options
The round() function lets us round the result:
>>> def gc_content(seq):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return round(GC_content,2)
...
>>> print(gc_content(test_seq))
0.54
71. Other function options
A second argument gives more flexibility:
>>> def gc_content(seq,sig_fig):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return round(GC_content,sig_fig)
...
>>> print(gc_content(test_seq,2))
0.54
>>> print(gc_content(test_seq,3))
0.538
72. Other function options
We can call a function with keyword arguments:
>>> def gc_content(seq,sig_fig):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return round(GC_content,sig_fig)
...
>>> print(gc_content(seq='ACGC',sig_fig=1))
0.8
>>> print(gc_content(sig_fig=1,seq='ACGC'))
0.8
73. Other function options
We can give our functions default values:
>>> def gc_content(seq,sig_fig=2):
...     length = len(seq)
...     G_content = seq.count("G")
...     C_content = seq.count("C")
...     GC_content = (G_content + C_content) / float(length)
...     return round(GC_content,sig_fig)
...
>>> print(gc_content(test_seq))
0.54
>>> print(gc_content(test_seq,sig_fig=3))
0.538
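Written as a script for Python 3, the finished function might look like the sketch below. The parameter name decimals (round() takes a number of decimal places, not significant figures), the upper-casing and the empty-sequence check are additions for illustration, not part of the slides.

```python
def gc_content(seq, decimals=2):
    """Return the GC fraction of `seq`, rounded to `decimals` places."""
    if not seq:
        # Avoid dividing by zero on an empty sequence
        raise ValueError("seq must not be empty")
    seq = seq.upper()   # count lowercase bases too
    gc_count = seq.count("G") + seq.count("C")
    # In Python 3, / always gives a float, so float() is not needed
    return round(gc_count / len(seq), decimals)

test_seq = "ACTGATCGATCGC"
print(gc_content(test_seq))               # default: two decimal places
print(gc_content(test_seq, decimals=3))   # keyword argument
```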
75. Conditions
Conditions are pieces of code that can only
produce one of two answers:
- True
- False
When required, Python tests (or evaluates) the
condition and produces the result.
>>> print( 3 == 5 )
False
>>> print( 3 < 5 )
True
>>> print( 3 >= 5 )
False
These are not strings!
76. Conditions
The following symbols are used to construct
conditions:
Symbol     Meaning
==         Equals
> <        Greater than, less than
>= <=      Greater than or equal to, less than or equal to
!=         Not equal
in         Is a value in a list
is         Are the same object*
Remember to use two equals signs
when writing conditions!
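The difference between == and is trips up many beginners, so here is a small Python 3 sketch. The list names are made up for the example.

```python
seq_a = ["ATG", "TGA"]
seq_b = ["ATG", "TGA"]   # same content, but a separate list
seq_c = seq_a            # a second name for the very same list

equal_values = seq_a == seq_b    # compares content
same_object = seq_a is seq_b     # compares identity
alias_same = seq_a is seq_c
has_start = "ATG" in seq_a       # membership test
not_three = len(seq_a) != 3

print(equal_values, same_object, alias_same, has_start, not_three)
```

For comparing sequences or numbers, == is almost always the operator we want.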
78. Conditional tests
An if statement only executes if the condition
evaluates as True:
>>> test_seq = 'ATTGCATGGTATCTACGG'
>>> if len(test_seq) < 10:
...     print(test_seq)
...
>>>
>>> test_seq = 'ATTGCATGG'
>>> if len(test_seq) < 10:
...     print(test_seq)
...
ATTGCATGG
• If statements have a similar structure to loops
79. Conditional tests
An if statement only executes if the condition
evaluates as True:
>>> seq_list = ['ATTGCATGGTATCTACGG',
... 'ATCGCA','ATTTTCA','ATTCATCGAT']
>>> for seq in seq_list:
...     if len(seq) < 10:
...         print(seq)
...
ATCGCA
ATTTTCA
When nesting commands,
be careful with the indentation!
80. Conditional tests
An else statement only executes when the if
statement(s) preceding it evaluate as False:
>>> seq_list = ['ATTGCATGGTATCTACGG',
... 'ATCGCA','ATTTTCA','ATTCATCGAT']
>>> for seq in seq_list:
...     if len(seq) < 10:
...         print(seq)
...     else:
...         print(str(len(seq)) + ' base seq')
...
18 base seq
ATCGCA
ATTTTCA
10 base seq
Remember: else statements
never have conditions!
81. Conditional tests
To create if/else blocks with multiple
conditions, we use elif statements:
>>> for seq in seq_list:
...     if len(seq) < 10:
...         print(seq)
...     elif len(seq) == 10:
...         print(seq[:5] + '...')
...     else:
...         print(str(len(seq)) + ' base seq')
...
18 base seq
ATCGCA
ATTTTCA
ATTCA...
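The same if/elif/else chain can be wrapped in a function so it is reusable. This is a Python 3 sketch; the function name describe and the list comprehension are additions for illustration.

```python
def describe(seq):
    """Return a short label for a sequence based on its length."""
    if len(seq) < 10:
        return seq
    elif len(seq) == 10:
        return seq[:5] + "..."
    else:
        return str(len(seq)) + " base seq"

seq_list = ["ATTGCATGGTATCTACGG", "ATCGCA", "ATTTTCA", "ATTCATCGAT"]
descriptions = [describe(seq) for seq in seq_list]
print(descriptions)
```

Only the first branch whose condition is True runs; the remaining branches are skipped.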
82. Boolean operators
Boolean operators let us group several
conditions into a single one:
>>> seq_list = ['ATTGCATGGTATCTACGG','AT',
... 'ATCGCA','ATTCATCGAT']
>>> for seq in seq_list:
...     if len(seq) < 3 or len(seq) > 15:
...         print(str(len(seq)) + ' base seq')
...     else:
...         print(seq)
...
18 base seq
2 base seq
ATCGCA
ATTCATCGAT
83. Boolean operators
There are three boolean operators in Python:
Boolean operator   Boolean operation   Result
and                False and False     False
                   True and True       True
                   True and False      False
or                 False or False      False
                   True or True        True
                   True or False       True
not                not True            False
                   not False           True
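A Python 3 sketch that uses all three operators on the sequence list from the previous slide. The length limits are the ones used there; the variable names are made up. The last lines check that "not (A or B)" gives the same answer as "(not A) and (not B)".

```python
seq_list = ["ATTGCATGGTATCTACGG", "AT", "ATCGCA", "ATTCATCGAT"]

medium = []
for seq in seq_list:
    too_short = len(seq) < 3
    too_long = len(seq) > 15
    # Keep sequences that are neither too short nor too long
    if not too_short and not too_long:
        medium.append(seq)

# The same test written with "or" and a single "not"
medium_again = []
for seq in seq_list:
    if not (len(seq) < 3 or len(seq) > 15):
        medium_again.append(seq)

print(medium)
```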
84. True/False functions
Functions can return True or False:
>>> def is_long(seq,min_len=10):
...     if len(seq) > min_len:
...         return True
...     else:
...         return False
...
>>> for seq in seq_list:
...     if is_long(seq):
...         print('Long sequence')
...     else:
...         print('Short sequence')
...
85. True/False functions
Functions can return True or False:
>>> for seq in seq_list:
...     if is_long(seq):
...         print('Long sequence')
...     else:
...         print('Short sequence')
...
Long sequence
Short sequence
Short sequence
Short sequence
86. True/False functions
Functions can return True or False:
>>> for seq in seq_list:
...     if is_long(seq,5):
...         print('Long sequence')
...     else:
...         print('Short sequence')
...
Long sequence
Short sequence
Long sequence
Long sequence
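Because a condition already evaluates to True or False, the function can return it directly. This is a shorter Python 3 variant of is_long, shown as an alternative rather than a replacement; the labels lists are made up to mirror the output above.

```python
def is_long(seq, min_len=10):
    """Return True if `seq` is longer than `min_len` bases."""
    return len(seq) > min_len

seq_list = ["ATTGCATGGTATCTACGG", "AT", "ATCGCA", "ATTCATCGAT"]

labels_default = []
labels_min5 = []
for seq in seq_list:
    labels_default.append("Long" if is_long(seq) else "Short")
    labels_min5.append("Long" if is_long(seq, 5) else "Short")

print(labels_default)
print(labels_min5)
```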
87. Conclusion
• Python is a very powerful language that is
currently used for many things:
• Bioinformatics tool development
• Pipeline deployment
• Big Data analysis
• Scientific computing
• Web development (Django)
The best way to learn to code
is through practice and
by reading other developers’ code!
88. References & Further Reading
• Official Python documentation:
https://www.python.org/doc/
• “Python for Biologists” by Dr. Martin Jones
www.pythonforbiologists.com
• E-books with biological focus
• CodeSkulptor: http://www.codeskulptor.org/
• Codecademy Python course:
https://www.codecademy.com/learn/python
• Jupyter project: http://jupyter.org/index.html