
Session 5: A C Program For Straight Line Fitting To Data: 1st Year Computing For Engineering

1. The document discusses a C programming exercise to perform a weighted least squares fitting of experimental data to a straight line model. 2. It covers the necessary C programming concepts like arrays, reading/writing files, and command line arguments. 3. The exercise asks students to write a program that reads data from a file, performs a weighted least squares fit to determine the straight line model that best fits the data, and outputs the results.

Uploaded by

kvgpraveen107
Copyright
© Attribution Non-Commercial (BY-NC)
1st Year Computing for Engineering

Session 5: A C program for Straight Line Fitting to Data


Michaelmas Term 1999
Lab Organizer: Prof D W Murray

The aim of this session is to get you to fly solo using C. You will create a program to perform least squares fitting using a straight line to a set of experimental data. First you learn about (i) arrays, (ii) reading from a file, and (iii) using ``command line arguments''.

Summary of what you have to do


Arrays: Spend about 10 minutes on this
Using files: Spend about 10 minutes on this
Command line args: Spend about 10 minutes on this
Understanding the theory: Spend about 20 minutes on this
EXERCISE 5: Spend about 60 minutes

1. Getting Started
After logging on and entering Openwin, type

% /packages/demo/yr1/script5
% cd session5

2. Arrays
Arrays of objects are declared by specifying the number of them in the array in square brackets behind the variable name. For example,
int i, day[7];
float x[10];
char alphabet[26];


declares an integer array day of length 7, and so on. Note however, unlike some languages, that if you specify n items, the index runs from 0 to (n-1) inclusive, and NOT 1 to n. So in our example, day[0] to day[6] are legal uses, but day[7] is not. In some circumstances this can be a nuisance, and you may wish to declare
int day[8];

so that you can use day[7] in your code. But reading this declaration could cause confusion for someone else. Perhaps a more explanatory way of declaring would be int day[7+1]; which makes it clearer that you are interested in only seven days, but that you wish to start on day[1].

Often the sizes of several arrays are the same, for example, when dealing with x,y points on a graph. It is then convenient to ``define'' a size. For example,
#include <stdio.h>
#define MAXPOINTS 100

int main(void)
{
  float x[MAXPOINTS+1], y[MAXPOINTS+1];  /* elements [1] to [MAXPOINTS] */
  ...

Now if you need to handle more points, you need only change the definition of MAXPOINTS, and not search through all your code for specific numbers.

Arrays very often appear inside loops, indeed whenever you want to perform the same operation on every element of the array:
#include <stdio.h>
#define MAXPOINTS 100

int main(void)
{
  int i;
  double x[MAXPOINTS+1],   /* elements [1] to [MAXPOINTS] */
         y[MAXPOINTS+1],
         modsq[MAXPOINTS+1];
  ... etc ...
  for(i=1;i<=MAXPOINTS;i++) {
    modsq[i] = x[i]*x[i] + y[i]*y[i] ;
  }
}

2.1 Arrays allocated at runtime (for info)

The arrays we have used so far have had their size declared in the program. Often it is convenient and economical with memory to set the size of an array while the program runs. You are recommended not to do this for this practical, but here is an example:
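#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int i,n_items;
  double *x;

  printf("Supply the length: ");
  scanf("%d",&n_items);
  x=(double *)calloc(n_items,sizeof(double));
  /* calloc returns NULL if the memory cannot be allocated */
  if (x == NULL) {
    return 1;
  }
  for(i=0;i < n_items;i++) {
    x[i] = 2.0*(double)i;
  }
  free(x);   /* give the memory back when finished */
  return 0;
}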

2.2 Multi-dimensional arrays (for info)

Multidimensional arrays are declared and used in the way you would guess. The range of all indices is from 0 to (n-1).

int main(void)
{
  double x[5][10];
  double p[32][4][2];

  x[0][0] = 4.0;
  x[4][9] = 3.0;
  x[5][8] = 3.2;  /* Error! WHY? */
  return 0;
}

There is a much nicer way of building multi-dimensional arrays using calloc, but this is beyond the scope of this lab.

mini-EXERCISE 5A
1. Compile, link and run the program arrays.c which you will find already in the session5 directory.
2. Check it makes sense!

3. File opening and closing


So far if we have needed to read or write to files we have used the redirect arrows.


It is however straightforward to open and close files from within a program using the functions fopen() and fclose(). The fopen() function returns a ``file pointer'', which must be declared. Items can be read from a text file using fscanf(), which is very much like scanf(), but it has the file pointer as an argument.

3.1 An example with just reading

The following is an example of reading. Notice that fscanf() takes the file pointer, not the filename, as an argument, and returns the number of items read.
#include <stdio.h>
#include <stdlib.h>   /* for exit() */

void error(char *string)
{
  /* writes an error message, and exits program */
  fprintf(stderr,"\nERROR * %s\n",string);
  exit(EXIT_FAILURE);
}

int main(void)
{
  /* declare the file pointer */
  FILE *fpin;
  float a,b;
  int n_items_read;
  char filename[50];

  printf("Supply filename: ");
  scanf("%49s",filename);   /* at most 49 characters fit in filename[50] */

  /* open the file for reading, and attach the pointer.
   * Then check that the fpin is not null
   * If it is, exit the program with a message */
  fpin = fopen(filename,"r");
  if (fpin == NULL) {
    error("Cannot read file for some reason");
  }

  /* read a line from the file */
  n_items_read=fscanf(fpin,"%f %f",&a,&b);

  /* now close the file */
  fclose(fpin);

  /* type useful diagnostics to the screen
   * then type a and b to the screen */
  printf("Read %d items from file %s\n",n_items_read,filename);
  printf("Value of a is %f and value of b is %f \n",a,b);
  return 0;
}

3.2 Reading and Writing

This next example shows reading and writing. For brevity, the checks on whether the files are readable and writeable are omitted.

#include <stdio.h>

int main(void)
{
  FILE *fpin, *fpout;
  char filename1[50],filename2[50];
  float a,b;

  printf("Supply input filename : ");
  scanf("%49s",filename1);
  printf("Supply output filename: ");
  scanf("%49s",filename2);

  fpin = fopen(filename1,"r");
  fpout = fopen(filename2,"w");

  fscanf(fpin,"%f %f",&a,&b);
  fclose(fpin);

  fprintf(fpout,"Value of a is %f and value of b is %f \n",a,b);
  fclose(fpout);
  return 0;
}

mini-EXERCISE 5B
1. Compile, link and run the programs fread.c and freadwrite.c which you will find already in the session5 directory. You will find a data file myfile.dat for your use.
2. Try also giving a bogus filename to check the error routine.

3.3 How to detect the reading is finished

Suppose you wanted to read a file where each line had a similar content, but where you did not know how many lines there were in total. How would you stop reading? One way is to test whether fscanf() has read the expected number of items, as follows. Why is n_points = i-1?
#include <stdio.h>
#define MAXPOINTS 100

int main(void)
{
  /* declare the file pointer */
  FILE *fpin;
  char filename[50];
  float x[MAXPOINTS+1], y[MAXPOINTS+1];
  int i,n_points;

  printf("Supply input filename : ");
  scanf("%49s",filename);
  fpin = fopen(filename,"r");

  /* stop at the end of the data, or when the arrays are full */
  i=1;
  while (i <= MAXPOINTS && fscanf(fpin,"%f %f",&x[i],&y[i]) == 2) {
    i++;
  }
  fclose(fpin);
  n_points = i-1;
  ... blah blah blah ...
}

4. Command line arguments


In the example above, the program asked for a couple of filenames to be typed in. A more convenient way of giving a program parameters is via the command line. For example it would have been convenient to type

% readwrite myfile.dat file.out

A few small edits to readwrite.c will do this. Edit the file as shown below:
#include <stdio.h>

int main(int argc, char **argv)           /* <--- Edit */
{
  /* declare the file pointer */
  FILE *fpin, *fpout;
  /* Delete the filename declaration         <--- Delete */
  float a,b;

  /* open the file fpin for reading, and attach the pointer */
  fpin = fopen(argv[1],"r");              /* <--- Edit */
  /* open the file fpout for writing, and attach the pointer */
  fpout = fopen(argv[2],"w");             /* <--- Edit */
  ... rest unchanged! ...
}

Once main() is declared in this new way, inside the main routine argc gives the number of arguments INCLUDING the command. (So argc above equals 3, NOT 2.) argv is an array of strings containing the arguments. This array starts at 0, and argv[0] is the command name itself (readwrite in the example). So, the input filename is contained in argv[1] and the output filename in argv[2].

At this point you have all the equipment to carry out the main exercise.

EXERCISE 5C: Least Squares Fitting


You are going to write a program that performs a weighted least squares fit of a straight line to a file of data. To start

% cd ~/exercise5

Task 1: Understanding Weighted Least-Squares Fitting

Often in experimentation, you will measure the variation of one quantity, y say, as another, x, is changed under your control. For example you might set the current through a device to a number of different values, and measure the voltage across the device. In such measurements, x is called the independent variable, and y is the dependent variable. Whereas x is assumed to be error-free, measurements of y are assumed to have an error characterised by the standard deviation σ in y. You end up then with a set of data points xi, yi, for each of the measurements i=1,...,n.

Having made measurements, you will want to test whether some physical model explains the data. For example, does the voltage vs current data from the device suggest that it is behaving as a resistor? If so, what is its resistance?

A way of doing this is to devise a model with a number of variable parameters, and then to vary the parameters so that they best fit the data. That is, if the parameters are p0, p1, ... , pn, we devise a model function f(p0,p1,...,pn, x). When we put in one of our chosen xi, we would like f(p0,p1,...,pn, xi) to be close to the corresponding measured yi.

Of course, because our measurements yi are likely to be in error, we cannot hope to fit all the measurements exactly, and so the fit is a matter of optimizing the overall likelihood of the interpretation given the data. If certain statistical pre-conditions are satisfied, it may be shown that this is achieved by varying the parameters until the minimum of the sum of the squares of the deviations

$$\sum_i \left[ y_i - f(p_0, p_1, \ldots, p_n, x_i) \right]^2$$

is found. Obviously this tries to make the fit go through the measured points. In fact, we can do a bit better than this if we know the weight to give to individual yi values. We minimize instead

$$\sum_i w_i \left[ y_i - f(p_0, p_1, \ldots, p_n, x_i) \right]^2$$

Notice that it tries harder to go through points where the weight wi is high. This is called a weighted least-squares fit. What is wi in all this? If the standard deviation on a measurement is small, then the weight is high, and vice versa. In fact

$$w_i = \frac{1}{\sigma_i^2}$$

Weighted least-squares fitting to a straight line

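Instead of a general function, suppose the model is a straight line,

$$y = p_0 + p_1 x$$

where there are two parameters, p0 and p1. We are after p0 and p1 such that

$$C = \sum_i w_i \left( y_i - p_0 - p_1 x_i \right)^2$$

is minimized. There are a number of ways to achieve the minimization, but we will choose a particularly straightforward one. If C is a minimum, then both $\partial C/\partial p_0$ and $\partial C/\partial p_1$ must be zero. Now

$$\frac{\partial C}{\partial p_0} = -2 \sum_i w_i \left( y_i - p_0 - p_1 x_i \right) = 0$$

and

$$\frac{\partial C}{\partial p_1} = -2 \sum_i w_i x_i \left( y_i - p_0 - p_1 x_i \right) = 0$$

We can lose the factor of -2, and break up the summations:

$$p_0 \sum_i w_i + p_1 \sum_i w_i x_i = \sum_i w_i y_i$$

$$p_0 \sum_i w_i x_i + p_1 \sum_i w_i x_i^2 = \sum_i w_i x_i y_i$$

or

$$p_0 S_w + p_1 S_x = S_y$$

$$p_0 S_x + p_1 S_{xx} = S_{xy}$$

where the S terms represent the summations. But these are just a pair of simultaneous equations in p0 and p1 with solutions

$$p_0 = \frac{S_{xx} S_y - S_x S_{xy}}{S_w S_{xx} - S_x^2}, \qquad p_1 = \frac{S_w S_{xy} - S_x S_y}{S_w S_{xx} - S_x^2}$$

So, to summarize, all we need to make the least squares fit is

1. to loop through the data accumulating the sums

$$S_w = \sum_i w_i, \qquad S_x = \sum_i w_i x_i$$

and so on

2. then compute p0 and p1 using the solutions to the simultaneous equations.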


Task 2: Design, then write your program

Look at file xys.dat. Each line contains, in floating point format, an x, y and σ value.

Design a program to read a file of such data, and to fit a straight line to it. The program should be usable by typing at the command line

% slf filename

and should print out an error message if the filename is not specified, or if the file cannot be read. Think about the steps required.

1. Open the file given as a command line argument
2. Read the data in x, y, sigma arrays
3. Close the file
4. Work out the summations (you'll need a loop)
5. Work out the parameters p0 and p1
6. Type out the results to screen

Please ask a demonstrator for advice if you get stuck.

Task 3: Plot the data and your fit

Type the following

% gnuplot

Then after the gnuplot prompt, type

gnuplot> plot "xys.dat" with errorbars, P0+P1*x

where P0 and P1 are replaced by the values from your fit.

Task 4: Change a weight, re-fit, re-plot

Edit one of the standard deviations in xys.dat --- you might make one very small --- and see how the fit is affected by re-plotting.

Finally, get signed out by a demonstrator.


IMPORTANT! Please note that this is the last laboratory in the Michaelmas part of the 1st Year Computing Laboratory. Your next lab session will be part of the Design Build and Test exercise, and the demonstrators there CANNOT sign this exercise off. If you do not complete during this session, look out for a notice on the 1st year notice board telling you when the final marking session will be held.

Logging out
As ever exit openwin. And then don't forget to logout from the console.

Lab devised by: David Murray
Lab Organizer: David Murray
Last changed June 16th, 1999
