codingPracticesSSW

The document outlines best practices for scientific computing and software engineering, emphasizing the importance of coding as a critical skill in research. It covers various management levels, including code, data, directory, and project management, while promoting modular programming, documentation, and collaboration. The conclusion highlights the significance of adhering to software engineering practices to enhance efficiency and user-friendliness in software development.

Uploaded by

Rohit Singh

Numerical Software Engineering 101/201
Scientific Software Club 2/13/17
Papers
● Best Practices for Scientific Computing, Wilson et al.
● Good Enough Practices in Scientific Computing, Wilson et al.
● Barely Sufficient Software Engineering: 10 Practices to Improve your CSE
Software, Heroux and Willenbring
Misconception: Coding is unimportant! It’s not like I’m a software
engineer...

(The crucial part is getting the numerical algorithm, proper data, good results, etc)
The (Relative) Truth: Coding is an important part of research and
a skill that takes years to hone

(Teach Yourself Programming in Ten Years by Peter Norvig)

Topics
● Code Level Management
● Data Management
● Directory Level Management
● Project Level Management
● Working with Others
● Documentation and Technical Writing
Code Level Management
Comment Succinctly (Design, Not Mechanism)
double AreaRectangle(double x, double y){
    /* AreaRectangle calculates the area of a
       rectangle with dimensions x and y */

    /* Return -1 if bad input */
    if(x < 0 || y < 0){
        printf("x and y must be non-negative numbers\n");
        return -1;
    }
    /* Return the product of x and y */
    return x*y;
}
Comment Succinctly
/*
runN4SID runs the system identification algorithm n4sid

~~~~INPUT~~~~
data: N x K time-domain signal, N = number of samples, K = dimension of data
p: parameters, including the measurement frequency in Hz and the model size to fit
~~~~~~~~~~~~~

~~~OUTPUT~~~
Fitted system model, saved in results folder as system.csv
~~~~~~~~~~~~~
*/
void runN4SID(const double *data, params p){
…
}
Name Intelligently
● Fits in with the earlier example: descriptive function and variable names
are extremely important
● A headache for numerical calculations
○ Numerical code might be ugly, but at least make sure each function is named well!
Name Variables Intelligently
void calcStuff(...){
A = getMatrix(...);
[U, D, V] = svd(A);
[X, Y] = getData(...);
[E, Z] = eig(X*A*Y);
w = getWeights(...);
[S, N] = sumEV(W, w);
B = convolveMatrix(A, N, S)
I = [ identity(N); identity(N)];
C = I*B + I*A;
[Q, R, P] = qr(C);
….
(you get the point)
}
class Central2D {
public:
    Central2D(float w, float h,     // Domain width / height
              int nx, int ny,       // Number of cells in x/y (without ghosts)
              float cfl = 0.45) :   // Max allowed CFL number
        nx(nx), ny(ny),
        nx_all(nx + 2*nghost),
        ny_all(ny + 2*nghost),
        dx(w/nx), dy(h/ny),
        cfl(cfl) {}

    static constexpr int nghost = 3;   // Number of ghost cells
    const int nx, ny;                  // Number of (non-ghost) cells in x/y
    const int nx_all, ny_all;          // Total cells in x/y (including ghost)
    const float dx, dy;                // Cell size in x/y
    const float cfl;                   // Allowed CFL number

    // Array accessor functions
    int offset(int ix, int iy) const { return iy*nx_all+ix; }

    float& u1(int ix, int iy) { return u1_[offset(ix,iy)]; }   // Solution values
    float& u2(int ix, int iy) { return u2_[offset(ix,iy)]; }
    float& u3(int ix, int iy) { return u3_[offset(ix,iy)]; }
    float& f1(int ix, int iy) { return f1_[offset(ix,iy)]; }   // Fluxes in x
    float& f2(int ix, int iy) { return f2_[offset(ix,iy)]; }
    float& f3(int ix, int iy) { return f3_[offset(ix,iy)]; }
    float& g1(int ix, int iy) { return g1_[offset(ix,iy)]; }   // Fluxes in y
    float& g2(int ix, int iy) { return g2_[offset(ix,iy)]; }
    float& g3(int ix, int iy) { return g3_[offset(ix,iy)]; }
    float& ux1(int ix, int iy) { return ux1_[offset(ix,iy)]; } // x differences of u
    float& ux2(int ix, int iy) { return ux2_[offset(ix,iy)]; }
    float& ux3(int ix, int iy) { return ux3_[offset(ix,iy)]; }
    float& uy1(int ix, int iy) { return uy1_[offset(ix,iy)]; } // y differences of u
    float& uy2(int ix, int iy) { return uy2_[offset(ix,iy)]; }
    float& uy3(int ix, int iy) { return uy3_[offset(ix,iy)]; }
    float& fx1(int ix, int iy) { return fx1_[offset(ix,iy)]; } // x differences of f
    float& fx2(int ix, int iy) { return fx2_[offset(ix,iy)]; }
    float& fx3(int ix, int iy) { return fx3_[offset(ix,iy)]; }
    float& gy1(int ix, int iy) { return gy1_[offset(ix,iy)]; } // y differences of g
    float& gy2(int ix, int iy) { return gy2_[offset(ix,iy)]; }
    float& gy3(int ix, int iy) { return gy3_[offset(ix,iy)]; }
    float& v1(int ix, int iy) { return v1_[offset(ix,iy)]; }   // Solution values at next
    float& v2(int ix, int iy) { return v2_[offset(ix,iy)]; }
    float& v3(int ix, int iy) { return v3_[offset(ix,iy)]; }

    // Diagnostics
    void solution_check();

    // Array size accessors
    int xsize() const { return nx; }
    int ysize() const { return ny; }

    // Read / write elements of simulation state
    float& operator()(int i, int j) {
        return u1_[offset(i,j)];
    }

    const float& operator()(int i, int j) const {
        return u1_[offset(i,j)];
    }

    // Wrapped accessor (periodic BC)
    int ioffset(int ix, int iy) {
        return offset( (ix+nx-nghost) % nx + nghost,
                       (iy+ny-nghost) % ny + nghost );
    }

    float& uwrap1(int ix, int iy) { return u1_[ioffset(ix,iy)]; }
    float& uwrap2(int ix, int iy) { return u2_[ioffset(ix,iy)]; }
    float& uwrap3(int ix, int iy) { return u3_[ioffset(ix,iy)]; }

    void run(float tfinal);
    // Call f(Uxy, x, y) at each cell center to set initial conditions
Decompose Programs into Functions
● Try to keep functions short
● Modularity makes code base more flexible, more easily modifiable
● Saves lines of code
● Practically speaking, humans can only remember a few things at a time!
Decomposing Programs into Functions
void calcStuff(...){
    Node root;
    …
    Node data;
    …
    bool checkchild = 0;
    for(i = 0; i < root.numchildren; i++){
        if(root.child[i] == data){
            checkchild = 1;
        }
    }
    ...
}
VS
void calcStuff(...){
    Node root;
    …
    Node data;
    …
    bool checkchild = isChild(root, data);
    ...
}
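The version of calcStuff that calls isChild moves the loop into a helper. A minimal sketch of what such a helper might look like follows. The slide does not define Node, so the struct below (an id plus a list of child pointers) is a hypothetical layout chosen only for illustration.

```cpp
#include <vector>

// Hypothetical node type: the slides do not define Node, so this layout
// is an assumption made for the sake of a compilable example.
struct Node {
    int id = 0;
    std::vector<const Node*> children;
};

// Returns true if `candidate` is a direct child of `parent`.
// Children are compared by address, not by value.
bool isChild(const Node& parent, const Node& candidate) {
    for (const Node* child : parent.children) {
        if (child == &candidate) {
            return true;
        }
    }
    return false;
}
```

Besides shortening calcStuff, the helper can be tested on its own and reused wherever the same membership check is needed.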
Eliminate Duplication
double calcValues(...){
    …
    X = getvalue(...);
    return X;
}
double calcValuesFilter(...){
    …
    X = getvalue(...);
    X = filter(X);
    return X;
}
VS
double calcValues(... , bool Filter){
    …
    X = getvalue(...);
    if( Filter == true){
        X = filter(X);
    }
    return X;
}
Keep Semantics Consistent
void scaleVec(vec v, double n){
    ...
}
VS
void scaleMatrix(double n, matrix m){
    ...
}

void filterEigenVecs(Matrix M){
    ...
}
VS
void filterEigVals(Matrix M){
    ...
}

void find_all_keys(keys K){
    ...
}
VS
void findAllKeyrings(rings R){
    ...
}
Use Data Structures (If necessary)
void doStuff(...
    double timestep, int size...
    date d, int dimx, int dimy…
    int numthreads){
    ...
}
VS
class metadata{
public:
    double timestep;
    int size;
    date d;
    int dimx;
    int dimy;
    int numthreads;
};
void doStuff(metadata d){
    …
}
Incremental Changes
● Emphasized in two papers
○ Decompose a large task into small components
○ Test the correctness of components
● Programmers are most productive working in small steps
○ Small steps also leave room for course correction
Defensive Programming
● Assert (or Try/Catch)
● Unit Testing
○ What if no “useful” unit tests?
○ Numeric Unit Tests
● Automated Testing and Continuous Integration
○ (to be covered in the future)
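As a rough illustration of the first two bullets, the sketch below pairs an assert with a numeric unit test. The routine norm2 and the helpers approxEqual and testNorm2 are hypothetical names, not taken from the slides, and the tolerances are arbitrary defaults rather than recommendations.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical routine under test: Euclidean norm of a vector.
double norm2(const std::vector<double>& v) {
    // Defensive check: fail loudly on input the routine was not designed for.
    assert(!v.empty() && "norm2 expects a non-empty vector");
    double sum = 0.0;
    for (double x : v) {
        sum += x * x;
    }
    return std::sqrt(sum);
}

// Floating-point comparison with a mixed absolute/relative tolerance.
bool approxEqual(double a, double b, double relTol = 1e-12, double absTol = 1e-15) {
    const double scale = std::fmax(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= std::fmax(absTol, relTol * scale);
}

// A numeric unit test: compare against a case with a known answer and
// against a property (scaling the input by 10 scales the norm by 10).
bool testNorm2() {
    const std::vector<double> v = {3.0, 4.0};
    const std::vector<double> scaled = {30.0, 40.0};
    return approxEqual(norm2(v), 5.0) &&
           approxEqual(norm2(scaled), 10.0 * norm2(v));
}
```

When no exact reference value is available, property checks of this kind (scaling, symmetry, conservation) are one way to get a useful numeric test. Note that assert is compiled out when NDEBUG is defined, so it suits internal invariants better than validation of user input.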
Abstractions
● Computer systems researchers often talk about getting the right
“abstractions”
○ Abstractions decrease the complexity of your software by hiding the low-level details
from the user
● Defining a convenient way to interact with your code base is hard!
○ Takes practice… cannot be quantified
○ What do you expose to the user (one of whom will surely be yourself)?
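One small example of hiding low-level details is sketched below. Grid2D is a hypothetical class, not part of the slides: callers index by (row, col), while the flat storage and the row-major offset arithmetic stay private.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D grid. The public interface exposes (row, col) indexing
// and the dimensions; the storage layout is an implementation detail.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), values_(rows * cols, 0.0) {}

    double& operator()(std::size_t row, std::size_t col) {
        return values_[offset(row, col)];
    }
    const double& operator()(std::size_t row, std::size_t col) const {
        return values_[offset(row, col)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    // Row-major layout; this could change without touching any caller.
    std::size_t offset(std::size_t row, std::size_t col) const {
        return row * cols_ + col;
    }

    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> values_;
};
```

What the class exposes (indexing and size) and what it hides (layout) is exactly the kind of design decision the bullets above describe. A different project might reasonably expose more, for example a raw pointer for passing data to an external library.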
Data Level Management
Save Raw and Intermediate Data
● Raw data D >> Intermediate Forms >> Result (yes or no)
○ You don’t just want to save the yes/no!
● Save Raw and Intermediate Forms
○ Saves time, extra processing, etc
Format Data Well
● Create data you wish to see in the world
○ Neatly labeled columns, information on format, etc
○ Important, especially if your data format changes down the road
● Space is cheap!
○ One variable per column, one observation per row, etc
○ Don’t cram!
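A minimal sketch of the layout described above: a header row with labeled columns, one variable per column, and one observation per row. The Observation record and its fields are hypothetical and chosen only for illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: one observation from a simulation run.
struct Observation {
    int step;
    double time_s;         // simulation time in seconds
    double temperature_K;  // temperature in kelvin
};

// Writes a header row with labeled columns (units in the column names),
// then one observation per row and one variable per column.
void writeTidyCsv(std::ostream& out, const std::vector<Observation>& rows) {
    out << "step,time_s,temperature_K\n";
    for (const Observation& r : rows) {
        out << r.step << ',' << r.time_s << ',' << r.temperature_K << '\n';
    }
}

// Convenience wrapper that returns the CSV text as a string.
std::string toTidyCsv(const std::vector<Observation>& rows) {
    std::ostringstream out;
    writeTidyCsv(out, rows);
    return out.str();
}
```

Writing to a std::ostream keeps the formatting code independent of where the data ends up, so the same function can target a file in the results folder or a string in a test.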
Manage Your Metadata
● What is “Metadata”?
○ In short: data about the data set
● Might include date produced, units, etc
● You’ll need it later!
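One lightweight way to keep metadata next to the data is a short block of "# key: value" lines, either at the top of the data file or in a sidecar file. The sketch below assumes that convention. The DatasetMetadata struct and its field names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical metadata record; the field names are illustrative only.
struct DatasetMetadata {
    std::string dateProduced;  // for example an ISO 8601 date
    std::string units;         // units of the stored values
    std::string description;   // what the data set contains
};

// Renders the metadata as "# key: value" lines, one field per line.
std::string renderMetadata(const DatasetMetadata& m) {
    std::ostringstream out;
    out << "# date_produced: " << m.dateProduced << '\n'
        << "# units: " << m.units << '\n'
        << "# description: " << m.description << '\n';
    return out.str();
}
```

Whether a given reader tolerates comment lines in a data file depends on the tool, so a separate sidecar file may be the safer choice.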
Publish Data
● (If you think others might want to use it)
● “Your data is as much a product of your research as the papers you write”
● Figshare, Dryad, Zenodo
Directory Level Management
Directory Names
● Your project should NOT be named “foo” or “a”
● Subdirectories should also be descriptive
○ Documentation in “docs”
○ Source in “src”
○ Scripts in “bin”
○ Etc…
● Should include a “data” and “results” folder
○ Make a distinction between what goes in each folder, as your results will surely contain
data!
○ Idea: every output goes in “results”, every input goes in “data”
Directory Names
❏ README
❏ LICENSE
❏ Tests
    ❏ testSightings.py
❏ data
    ❏ birdcount.csv
❏ doc
    ❏ notebook.md
    ❏ changelog.txt
❏ results
    ❏ summarized-results.csv
❏ src
    ❏ Sightings.py
Subdirectories (Don’t make too many)
❏ src
    ❏ helpers
        ❏ datastructs
            ❏ graph
                ❏ graphsearch
                    ❏ methods
                        ❏ dfs.py
Don’t Repeat Previous Work
● Use external libraries as much as possible
○ They are usually well optimized, and they save development time
● Use Google, GitHub, cppreference, etc
Project Level Management
Version Control
● Discussed Earlier This Semester
● Git, CVS, Mercurial, etc
○ Git preferred (Github, Bitbucket)
● Commit often, Commit early
● Don’t add large data dumps/files!
○ Makes version control slow, impractical
○ We will discuss later in semester how to manage this stuff
Adding Features, Refactoring
● Add features incrementally
○ Constantly check correctness
○ Don’t expect to add 1k+ lines and have your code work the first time
● Refactoring is a natural part of coding
○ Don’t avoid it, or you will end up with bloated code
To use an IDE or not to use an IDE...
● I’m not sure!
○ What if you like Microsoft Visual Studio, Eclipse, PyCharm?
○ Problem: code should be accessible to everyone
○ Getting libraries integrated into an IDE can be painful
■ For numeric libraries, even more annoying
■ Some software makes this easier, e.g. Intel Parallel Studio XE, NVIDIA Nsight, etc
○ If you’re prototyping and know the IDE’s debugging and profiling tools well, why not?
○ Watch for a mismatch between the IDE environment and the deployment environment
Issue-Tracking Software
● Common Mistake
○ “I need to refactor A, B, C and debug I, J, K”
○ (One seminar and one nap later) “What was I supposed to do again?”
● Many out there (Wikipedia lists ~ 50)
○ Bugzilla, Apache Bloodhound, Planbox, etc
Working with Others
Industry vs Academia
● In industry, a group of experienced engineers is often assigned to manage
a single piece of software
● In academia, a single person might manage multiple pieces of software
Getting a Second Look
● Just as research ideas need a second look, so does a potential code base
● Pair Programming is extremely beneficial
○ Could be a problem if you’re the only one working on a project
● Coding with others ultimately makes you a better programmer
Documentation and Technical Writing
Create Barely Sufficient Documentation
● Somewhat covered last semester
○ Documentation generation via Sphinx, Doxygen, etc
● You are writing the documentation for yourself as well as others!
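For tools such as Doxygen, documentation is generated from structured comments in the source. A minimal sketch using Doxygen's @brief, @param, and @return commands is shown below. The function is a renamed variant of the rectangle-area example from the commenting slide, and the parameter names are chosen here for illustration.

```cpp
/**
 * @brief Computes the area of a rectangle.
 *
 * @param width  Width of the rectangle; must be non-negative.
 * @param height Height of the rectangle; must be non-negative.
 * @return The area, or -1.0 if either argument is negative.
 */
double areaRectangle(double width, double height) {
    if (width < 0.0 || height < 0.0) {
        return -1.0;
    }
    return width * height;
}
```

The comment states the contract (inputs, output, error value) rather than the mechanism, which is often the part a future reader, including the original author, needs most.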
Document All Work You’ve Done
● Not just the code you plan to release; code you’ve written but not used,
ideas you’ve tried (both successful and unsuccessful), etc
Reports and Papers
● Writing a paper or technical report? Put it under version control as well
● Formal Approach: Treat paper/report writing as programming.
● Saves you time and effort down the road
Figures
● One script per figure
● Don’t manually change parameters; input them into functions
● Automation
○ Don’t be tempted to manually adjust window size and click the “save as” button in
MATLAB
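The same idea carries over to other plotting tools. The sketch below assumes gnuplot rather than MATLAB and generates the plotting script for one figure from explicit parameters, so nothing is adjusted by hand. FigureSpec and makeGnuplotScript are hypothetical names, and the file paths in the usage are placeholders.

```cpp
#include <sstream>
#include <string>

// Hypothetical figure description: every value that might otherwise be
// adjusted by hand in a GUI is an explicit parameter.
struct FigureSpec {
    std::string dataFile;    // input CSV with x in column 1 and y in column 2
    std::string outputFile;  // image file to write
    int widthPx;
    int heightPx;
};

// Builds a gnuplot script for a single figure. Assumes gnuplot with the
// pngcairo terminal is the plotting tool; the slides only mention MATLAB.
std::string makeGnuplotScript(const FigureSpec& fig) {
    std::ostringstream s;
    s << "set terminal pngcairo size " << fig.widthPx << ',' << fig.heightPx << '\n'
      << "set output '" << fig.outputFile << "'\n"
      << "set datafile separator ','\n"
      << "plot '" << fig.dataFile << "' using 1:2 with lines notitle\n";
    return s.str();
}
```

A driver program could write the returned text to one script file per figure, so that regenerating a figure after the data changes is a single command rather than a sequence of clicks.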
Conclusions
Conclusions: Takeaways
● Following software engineering best practices saves development time and
headaches, and it improves user-friendliness
● Developing (and maintaining) software is hard!
Conclusions: Questions
● Why put in all this effort if no one else is going to use my code?
● Considering the time spent improving non-essential parts of my code, will
the time saved from following best practices be greater than the extra
development time invested?
