
OverviewOnFiles Up

Uploaded by

csoundes2005
Copyright
© © All Rights Reserved

File Structure and Data Structure : FSDS

Chapter 1: Files and Storage Technologies

DR. LAKEHAL ABDERRAHIM

Email : [email protected]
STORAGE MEDIA
There are different levels of storage media:

1- PROCESSOR REGISTERS: Very fast (a few ns), volatile, size of a few machine words.

2- CACHE MEMORY BETWEEN PROCESSOR AND MAIN MEMORY (MM): Faster than MM (a few tens of ns), volatile, very small size.

3- MAIN MEMORY (MM): Fast (a few hundred ns), volatile, relatively small size.

4- BUFFER ZONE (PLAYING THE ROLE OF CACHE) BETWEEN MM AND SECONDARY STORAGE (SS): A small part of the main memory reserved by the system -> "buffer cache".

5- DIFFERENT TYPES OF SECONDARY STORAGE (SS): Mainly magnetic disks (hard disk drives, HDD) and flash disks (SSD...), slower than MM, non-volatile, large size and less expensive than MM.
MEMORIES
Hierarchy and characteristics

[Figure: memory hierarchy pyramid. From top to bottom: registers, CPU caches (low level), main memory (MM, RAM), solid-state disk (SSD), magnetic storage (HDD), optical storage (CD, DVD), archiving memories. The upper levels are labelled primary storage, the lower levels secondary storage (SS). Toward the top: fast access, decreasing storage capacity, increasing price. Toward the bottom: slow access, increasing storage capacity, decreasing price.]

MEMORIES
Low level memories
Flip-Flops: Responsible for storing and maintaining a single binary bit of information
until a new input signal updates their state.

Register: Refers to a small high-speed storage unit within the central processing unit
(CPU) of a computer, used to temporarily hold data that is actively being processed or
manipulated during computational tasks.

Cache memories
Level 1 Cache: A very small, extremely fast memory unit located directly within the central processing unit (CPU) that stores frequently accessed instructions and data to speed up processing.

Level 2 Cache: A larger memory unit located between the central processing unit (CPU) and the main memory, serving as an intermediate storage space for frequently accessed data, thereby improving overall system performance.

Level 3 Cache: A larger, shared cache memory unit that is located beyond the central processing unit (CPU) and L2 cache, typically on the same chip or within the same package as the CPU, designed to further enhance data access and system performance for multiple CPU cores or processors.

[Figure: CPU → Cache L1 → Cache L2 → Cache L3 → Main Memory]
Memories
Random Access Memory (RAM) - Main Memory (MM)

DRAM (Dynamic RAM): A volatile computer memory that stores data in the form of electrical charges and requires constant refreshing to maintain it.

SRAM (Static RAM): Unlike dynamic memory, it does not need to periodically refresh its content. Like dynamic memory, it is volatile. In its primary role, SRAM serves as cache memory, acting as an interface between DRAM and the CPU (faster than DRAM).

SDR SDRAM (Single Data Rate Synchronous Dynamic RAM), 1993: A high-speed computer memory that transfers data once per clock cycle.

DDR SDRAM (Double Data Rate Synchronous Dynamic RAM; DDR1, DDR2, DDR*), 2000: A type of computer memory technology that transfers data on both the rising and falling edges of the clock signal, effectively doubling the data transfer rate compared to SDR (Single Data Rate) memory.

LP DDR SDRAM (Low Power DDR), 2009: An energy-efficient memory technology for mobile devices, reducing power consumption while maintaining decent performance.
Memories

Secondary storage (SS)

Magnetic storage: A method of data storage that uses magnetic patterns on a medium, such as a hard disk drive (HDD) or magnetic tape, to store and retrieve digital information.

Solid-State storage (SSD): A data storage technology that uses NAND flash memory for faster performance and better durability than magnetic hard drives.

Optical storage: Uses lasers to read and write data on optical discs, such as CDs, DVDs, and Blu-ray discs, for the purpose of storing and retrieving digital information.

External storage: Refers to additional devices, such as external hard drives, USB flash drives, or network-attached storage (NAS), which increase storage capacity for backup, file sharing, and data expansion.

Cloud storage: A service that allows users to store, access, and manage their digital data and files remotely on servers managed by cloud service providers via the Internet.
Memories
Others

❑ Read-only memory (ROM): A type of non-volatile computer memory containing permanent data or instructions that cannot be easily modified or erased by computer instructions. Examples:

BIOS (basic input/output system), the firmware traditionally stored in ROM

EPROM (erasable programmable ROM)

EEPROM (electrically erasable programmable ROM)

❑ Virtual Memory: A memory management technique used by computer operating systems that combines the physical RAM (random access memory) of a computer with temporary disk space to effectively increase the memory available for execution.
Comparison of Storage media types
Characteristics

Characteristic | Registers | Cache Memory | Main Memory | Secondary Storage
Speed | Fastest (< 10 nanoseconds) | Faster than main memory (< 100 nanoseconds) | Slower than cache memory (> 100 nanoseconds) | Slowest (~ milliseconds)
Size | Very small (1-10 bytes) | Larger than registers (1-10 MB) | Large (2-100 GB) | Very large (GB-TB)
Cost | Most expensive | More expensive than main memory | Less expensive than cache memory | Least expensive
Location | Inside the processor | Between the processor and main memory | Inside the computer | Outside the computer
Usage | Data and instructions currently in use | Data and instructions likely to be used frequently | Data and instructions of running programs | Data and instructions of programs not currently running
Type | Volatile | Volatile | Volatile | Permanent
THE HISTORY OF STORAGE TECHNOLOGIES

Punched cards
• 19th century
• Mechanical storage.

Magnetic tape
• 1950s
• Sequential data access.

Hard drive (HDD)
• 1956 (IBM)
• Uses rotating disks.

Floppy Disks
• 1970-2000
• Portable storage with low storage capacity.

CD & DVD
• 1980-2000
• Optical storage for software and music, with larger capacity.

Solid State Drive (SSD)
• 1970 ~ 2005
• Based on NAND flash memory, faster and more durable than HDDs.

USB & microSD
• 2000
• Removable storage for saving, transferring, and transporting data.

Cloud
• 2000
• Storage solution accessible via the internet.
TYPES OF STORAGE MEDIA
Hard drive (HDD)

❑ Disks: A disk is a magnetic disk covered with a thin layer of magnetic particles (platters). Data is stored in the form of bits on the disk.
❑ Platter: A circular disk made of a magnetic material where data is stored.
❑ Track: A concentric circle on a platter where data is stored.
❑ Sector: A portion of a track where data is stored.
❑ Cylinder: A vertical stack of tracks that align across all platters.
❑ Read/Write Head: An electronic component that reads and writes data on the disk. Heads are located on both sides of each platter.
❑ Disk arm: A mechanical arm that holds the read/write heads and moves them across the platters.
❑ Motor: A device that spins the platters at a constant speed.
❑ Controller: A circuit that controls the movement of the read/write heads and manages the transfer of data between the HDD and the computer.

[Figure: annotated hard drive showing the platter, motor, read/write head, actuator, interface, riders, power supply, and tracks (Track 0, Track 1)]
TYPES OF STORAGE MEDIA
Solid State Drive (SSD)

[Figure: front and back views of a solid state drive]

An SSD is a secondary storage (SS) device based on NAND flash memory. It is robust and offers better performance than a magnetic disk. It is composed of:

Flash memory: A type of non-volatile memory that can be used to store data permanently.

Controller: An integrated circuit that manages the transfer of data between the flash memory and the computer.
TYPES OF STORAGE MEDIA
Solid State Drive (SSD)

❑ Data is stored in microchips.

❑ The memory is divided into blocks (e.g., 256 KB) and each block is composed of a certain number of pages (e.g., 4 KB).

[Figure: a BLOCK made of Page 1, Page 2, Page 3, Page 4, …, Page 64]

There are 3 possible operations:
❑ Read a page → very fast (20 microseconds)

❑ Write a page, provided it is in an erased state → fast (100 to 200 microseconds)

❑ Erase all pages of a block → slow (a few milliseconds)
TYPES OF STORAGE MEDIA

[Figure: a BLOCK made of Page 1, Page 2, Page 3, Page 4, …, Page 64]

❑ Blocks support a fixed number of erasures, beyond which they become unusable. It is therefore necessary to distribute erasures uniformly across all blocks of the disk ⇒ This is the wear leveling technique.

❑ When updating the content of a page, it is preferable to write the new content to a new page (already erased).

❑ Physical page numbers must therefore be hidden from users ⇒ This is the role of the FTL (Flash Translation Layer).
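
The rules above (a page is written only when erased, updates go to a fresh page, erasure works on whole blocks, physical page numbers stay hidden) can be sketched as a toy translation layer. This is only an illustrative sketch, not how a real SSD controller is implemented: the tiny sizes, the names `ftl_init`, `ftl_write`, `ftl_read`, `ftl_erase_block`, and the naive "first erased page" allocation are all assumptions; garbage collection and an actual wear-leveling policy are left out (the erase counter is merely the information such a policy would use).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGES_PER_BLOCK 4     /* assumption: tiny sizes for readability */
#define NUM_BLOCKS      4
#define NUM_PAGES       (PAGES_PER_BLOCK * NUM_BLOCKS)
#define PAGE_SIZE       8
#define UNMAPPED        (-1)

enum page_state { ERASED, VALID, STALE };

static char            flash[NUM_PAGES][PAGE_SIZE]; /* physical pages          */
static enum page_state state[NUM_PAGES];            /* state of each page      */
static int             map[NUM_PAGES];              /* logical -> physical     */
static int             erase_count[NUM_BLOCKS];     /* input for wear leveling */

/* Must be called once before any other function. */
void ftl_init(void) {
    for (int p = 0; p < NUM_PAGES; p++) { state[p] = ERASED; map[p] = UNMAPPED; }
    for (int b = 0; b < NUM_BLOCKS; b++) erase_count[b] = 0;
}

/* Out-of-place update: store data in the first erased page and remap.
   Returns the physical page used, or -1 when no erased page is left. */
int ftl_write(int logical, const char *data) {
    assert(logical >= 0 && logical < NUM_PAGES);
    int phys = 0;
    while (phys < NUM_PAGES && state[phys] != ERASED) phys++;
    if (phys == NUM_PAGES) return -1;   /* a real FTL would reclaim stale pages here */
    strncpy(flash[phys], data, PAGE_SIZE - 1);
    flash[phys][PAGE_SIZE - 1] = '\0';
    state[phys] = VALID;
    if (map[logical] != UNMAPPED) state[map[logical]] = STALE; /* old copy is now garbage */
    map[logical] = phys;
    return phys;
}

/* The caller only ever sees logical page numbers. */
const char *ftl_read(int logical) {
    assert(logical >= 0 && logical < NUM_PAGES);
    return map[logical] == UNMAPPED ? NULL : flash[map[logical]];
}

/* Erase a whole block; refused (-1) while it still holds valid data.
   Returns the block's updated erase count otherwise. */
int ftl_erase_block(int b) {
    assert(b >= 0 && b < NUM_BLOCKS);
    int first = b * PAGES_PER_BLOCK;
    for (int p = first; p < first + PAGES_PER_BLOCK; p++)
        if (state[p] == VALID) return -1;
    for (int p = first; p < first + PAGES_PER_BLOCK; p++)
        state[p] = ERASED;
    return ++erase_count[b];
}
```

Writing the same logical page twice lands on two different physical pages; the first one is merely marked stale until its block can be erased.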
EXTERNAL MEMORY ACCESS INTERFACE
Regardless of the type of disk (HDD or SSD), the interface remains the same: the unit of transfer is a physical block.

[Figure: programs Prog_1 … Prog_5 run in main memory and share a buffer cache (Buffer_1, Buffer_2, Buffer_3, …, Buffer_m). Secondary storage is an array of physical blocks Bloc_0, Bloc_1, …, Bloc_N-1. Physical I/O between the two is done with ReadBloc(i) and WriteBloc(i, X).]

❑ The common interface for HDD disks is SATA.

❑ The common interfaces for SSD disks are SATA and NVMe.
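
As a rough illustration of the ReadBloc(i) / WriteBloc(i, X) interface, the sketch below treats an ordinary file as the "disk" and transfers exactly one fixed-size block per call. The block size of 512 bytes, the function names `read_block` and `write_block`, and the use of a regular file in place of a device are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 512   /* assumption: chosen only for this sketch */

/* ReadBloc(i): copy physical block i of the disk into buf.
   Returns 0 on success, -1 on error (e.g. block beyond the end). */
int read_block(FILE *disk, long i, void *buf) {
    assert(i >= 0);
    if (fseek(disk, i * BLOCK_SIZE, SEEK_SET) != 0) return -1;
    return fread(buf, BLOCK_SIZE, 1, disk) == 1 ? 0 : -1;
}

/* WriteBloc(i, X): copy buf into physical block i of the disk.
   Returns 0 on success, -1 on error. */
int write_block(FILE *disk, long i, const void *buf) {
    assert(i >= 0);
    if (fseek(disk, i * BLOCK_SIZE, SEEK_SET) != 0) return -1;
    if (fwrite(buf, BLOCK_SIZE, 1, disk) != 1) return -1;
    return fflush(disk) == 0 ? 0 : -1;
}
```

A caller would open the backing file in binary update mode (for example "rb+" or "wb+") and always pass buffers of BLOCK_SIZE bytes, since the interface never transfers less than a whole block.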
EXTERNAL MEMORY ACCESS INTERFACE

[Figure: an HDD with its SATA cable and SATA port; an NVMe SSD with its M.2 port]

HDD VS SSD

Characteristic | HDD (Hard Drive) | SSD
Storage Capacity | High | Low
Speed | Low | High
Noise and Vibrations | Noisy and Vibrating | Silent and Non-Vibrating
Durability | Less Durable | More Durable
Cost | Less Expensive | More Expensive
ARCHITECTURE

[Figure: motherboard architecture showing the power connector, memory slots, processor socket (CPU), FSB, clock generator, BIOS, connectors for peripherals (keyboards, mice, USB, etc.), daughter cards, and S-ATA storage (hard drives, DVD readers, etc.)]
DATA LOADING

Hard Drive → RAM Memory → Processor

Data from the hard drive is loaded into RAM. Once the data is in RAM, the CPU can access it.
DATA STRUCTURE

❑ A data structure is an organized arrangement of data that takes into account not only the recorded entities themselves, but also their interconnections, based on their usage context, such as how a student is linked to grades, or a patient to treatments.

Student
Name: String
Modules: Table or List of String
Marks: Table or List of float

Patient
Name: String
Medical tests & report: List (String and float)
Prescriptions and doses: List (String and float)
DATA STRUCTURE

Primitive Data Structure:

❑ Represents the basic structure of data in computer science.

❑ Available in most programming languages as built-in types.

❑ Primitive data is directly manipulated by machine instructions.

Non-Primitive Data Structure:

❑ More sophisticated data structures, directly derived from primitive data.

❑ Deals with the structuring of a group of complex homogeneous and/or heterogeneous data elements.
GLOBAL OVERVIEW OF DATA STRUCTURING DURING STORAGE AND MANIPULATION

[Figure: users enter raw data through the human-machine interface; input processing organizes it with a data structure in internal memory and a file structure in external storage; output processing retrieves the stored data for the users.]
FILES
A file is the concept through which a program or application stores data in external memory.

Files representation
Regardless of the file type, it is used at different levels of abstraction with different semantics: the logical level (application) and the physical level (internal).

Aspect | Logical Level of Files | Physical Level of Files
Data Organization | Structured according to a logical schema (record, table, list,...) | Stored in binary form without a defined structure
Data Operations | Logical operations (SQL, queries, search and replace,...) | Physical operations (read/write)
Area of Interest | Structure, meaning, queries, instructions,... | Storage, space management, access
Example | Relational database, files (Excel, CSV,...) | Files on a hard drive, USB,...
Users | Developers, application users (data entry agent, doctor,...) | Operating systems, file managers
NOTIONS OF FILES AT THE LOGICAL LEVEL
File: A file at the logical level (application, "high-level programming language") is considered as:

❑ A large list of logically linked information, addressed in the form of records composed of different typed fields (typed file).

• Fields: Primitive data characterized by a size, length and type.

char Name[50];   // String (table of characters)
int Age;         // Integer (size of 32 bits)

• Record: Collection of fields with a logical relationship.

struct Student {            // Structure named Student
    char Name[50];          // String
    int Age;                // Integer
    char Num_Student[20];   // Table of characters (ID)
};

❑ Or a stream of raw bytes without a well-defined structure (non-typed file). These streams generalize the notion of I/O so that it is independent of the type of device used.
NOTIONS OF FILES AT THE LOGICAL LEVEL

Aspect | Typed Files (Sequence of Records) | Non-Typed Files (Byte Stream)
Data Structure | Structured in records with defined fields | Raw byte stream without defined structure
Data Types | Specific data types for each field | No explicit specification of data types
Ease of Interpretation | Easier to interpret due to the defined structure, even with basic software (Notepad) | More difficult to interpret due to the absence of structure: requires dedicated software (e.g. Dev-C++ to run the executable)
Flexibility | Less flexible in terms of data types | More flexible for storing various data types
Example | Relational database, CSV, Excel files | Binary files (images, audio, executables), text files
NOTIONS OF FILES AT THE LOGICAL LEVEL

In the case of typed files, records are formed by a set of fields (or attributes). Among these fields, one or more can play the role of search key.

Student
ID_etudiant (student ID): String
Name: String
Age: Integer
Modules: Table or list of String
Marks: Table or list of float

[Figure: a "Search Key" label marks the field(s) of the Student record used as key]

Example: Search for students over 20 years old and group them in the same section.
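
The example query (students over 20, grouped into one section) can be sketched in C with the age field acting as the search key. The `Student` layout, the field sizes, and the function name `select_older_than` are assumptions chosen for illustration; the records are held in an in-memory array rather than read from a file.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char id[20];     /* student ID */
    char name[50];
    int  age;        /* field used as the search key in this sketch */
} Student;

/* Copy into out[] every student whose age is strictly greater than
   min_age and return how many were selected. out must have room for n. */
size_t select_older_than(const Student *in, size_t n, int min_age, Student *out) {
    assert(in != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (in[i].age > min_age)
            out[k++] = in[i];
    return k;
}
```

With a typed file the same loop would run over the records of each block; the key comparison itself does not change.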
NOTIONS OF FILES AT THE PHYSICAL LEVEL

At the physical level, a file in a file management system (FMS) is composed of multiple blocks, where each block contains a sequence of uninterpreted bytes. Information, such as records, is stored inside these blocks according to a predefined structure.

[Figure: File = Collection of Physical Blocks. Records are packed three per block: (Rec 1, Rec 2, Rec 3), (Rec 4, Rec 5, Rec 6), (Rec 7, Rec 8, Rec 9), (Rec 10, Rec 11, Rec 12), …, (Rec n-2, Rec n-1, Rec n)]

Access to the contents of the blocks is done through input/output (I/O) operations.
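
When every block holds the same number of records, as in the figure (three per block), the block that contains a given record follows from integer division. The sketch below assumes 1-based record and block numbers and a fixed number of records per block; the names `RecordAddress` and `locate_record` are hypothetical.

```c
#include <assert.h>

typedef struct {
    long block;  /* block number, starting at 1          */
    int  slot;   /* position inside the block, from 1    */
} RecordAddress;

/* Locate record k (k >= 1) when each block stores per_block records. */
RecordAddress locate_record(long k, int per_block) {
    assert(k >= 1 && per_block >= 1);
    RecordAddress a;
    a.block = (k - 1) / per_block + 1;
    a.slot  = (int)((k - 1) % per_block) + 1;
    return a;
}
```

With three records per block, Rec 7 is the first record of block 3 and Rec 12 is the third record of block 4, so reading either one costs a single block I/O.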
NOTIONS OF FILES AT THE PHYSICAL LEVEL: FILES IN AN FMS

Concept: Data Blocks
- Files are divided into data blocks of fixed or variable size.
- Each block is identified by a block number or a physical address.
- Files are spread across multiple blocks due to their size.
- The data of a file is stored sequentially in the order of addition.

Concept: File Management System (FMS)
- Each FMS has its own structure for organizing files and data blocks.
- Stores information such as the location of blocks, authorizations, and metadata.

Concept: Uninterpreted Bytes
- The bytes in the data blocks are not interpreted by the file system.
- The FMS does not know if the data is text, images, etc.
- The interpretation of the data is the responsibility of the applications that read the files.

Concept: Space Management
- FMSs manage the allocation and deallocation of block spaces.
- They maintain a table to track the blocks used by files.

Note: The NTFS file system of Windows uses the MFT (Master File Table) to store metadata about files (characteristics and information).
OVERVIEW ON FILE CHARACTERISTICS

❑ File characteristics

Characteristic | Description
File Name | Unique identifier of the file.
File Extension | Indicates the type/format of the file content.
Size | Quantity of data contained in the file.
Content Type | Nature of the data (text, image, audio, etc.).
Location | The physical location where the file is stored.
Creation/Modification Date | Timestamp of creation and last modification.
Access Permissions | Controls access to the file (read, write).
File Formats | Specific format defining the structure.
Program Associations | Associated program for opening the file.
FILE CHARACTERISTIC MANAGEMENT

❑ Information Storage : This information is stored in reserved locations on the disk. For example, a characteristic
table can be stored at a fixed address on the disk.

❑ Information Retrieval : When an application wants to open a file, the system retrieves its information from the
characteristic table.

❑ Usage in Sequential Secondary memory: In sequential storage media like magnetic tapes, this characteristic
information was generally placed at the beginning of each file, called a header block.

❑ Specific Characteristics : Certain applications also need to manage specific characteristics for manipulating their
data files. These characteristics can be stored at the beginning of the file, before the data.
FILE OPERATIONS AND ACTIVITY

❑ File Operations: Represent the flow of input/output of file data between the processor, RAM, and secondary storage:
❑ Creating a file.
❑ Inserting records or fields into a file.
❑ Modifying data within the file.
❑ Reading data from a file.
❑ Deleting a record or a file.
❑ Merging files or splitting a file.

❑ File Activity: A set of indicators covering the operations and actions performed on a file.

Indicator | Description
Consultation Rate | Frequency of file reading by users or applications.
Renewal Rate | Frequency of file updates or modifications.
Popularity | Relevance and usefulness of the file for users or applications.
Version History | Number of different versions of the file created over time.
Consultation Time | Duration during which the file is open or read.
Parallel Access | Number of users or applications simultaneously accessing the file.
Backups and Restorations | Frequency of file backups to prevent data loss.
BUFFER CACHE

All operations at the application level are translated into low-level operations that access the physical blocks of secondary memory (SM). These physical accesses are slow.

The system therefore maintains a buffer cache in main memory to temporarily store copies of physical blocks, using strategies to choose the most relevant ones.

[Figure: Prog 1 (uses A), Prog 2 (uses B) and another program (uses B) work on copies of blocks A and B held in main memory; blocks A, B, C, D reside in secondary memory; physical I/O moves blocks between the two.]

❑ During physical reading, the system copies the block into the buffer area. If there is no space, it can overwrite an existing block, first saving it to disk if needed, depending on the replacement strategy.

❑ Updates to records by applications are first made in the buffer area in main memory. Physical writing to disk is deferred until later (for example, during synchronization).

Note: To prevent data loss in case of a failure, the system synchronizes its cache with secondary storage by periodically writing all modified blocks held in main memory.
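
The behaviour described above (read through the cache, deferred writes, replacement when the cache is full, periodic synchronization) can be sketched with a very small write-back cache. Everything here is an assumption made for illustration: the "disk" is an in-memory array, the replacement strategy is least-recently-used, and the names `cache_init`, `cache_read`, `cache_write`, `cache_sync` and the counter `physical_io` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE  16
#define DISK_BLOCKS 8
#define CACHE_SLOTS 2

static char disk[DISK_BLOCKS][BLOCK_SIZE]; /* stands in for secondary storage */
static int  physical_io = 0;               /* number of simulated disk transfers */

typedef struct {
    int  block;      /* block number held, -1 when the slot is empty */
    int  dirty;      /* modified in memory, not yet written to disk  */
    long last_used;  /* logical clock value for LRU replacement      */
    char data[BLOCK_SIZE];
} Slot;

static Slot cache[CACHE_SLOTS];
static long ticks = 0;

void cache_init(void) {
    for (int i = 0; i < CACHE_SLOTS; i++) {
        cache[i].block = -1; cache[i].dirty = 0; cache[i].last_used = 0;
    }
    physical_io = 0;
    ticks = 0;
}

/* Physical write, performed only for modified blocks. */
static void write_back(Slot *s) {
    if (s->block >= 0 && s->dirty) {
        memcpy(disk[s->block], s->data, BLOCK_SIZE);
        physical_io++;
        s->dirty = 0;
    }
}

/* Return the slot holding block b, loading it over the LRU slot on a miss. */
static Slot *get_slot(int b) {
    assert(b >= 0 && b < DISK_BLOCKS);
    Slot *victim = &cache[0];
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].block == b) { cache[i].last_used = ++ticks; return &cache[i]; }
        if (cache[i].last_used < victim->last_used) victim = &cache[i];
    }
    write_back(victim);                      /* save the evicted block if modified */
    memcpy(victim->data, disk[b], BLOCK_SIZE);
    physical_io++;                           /* physical read */
    victim->block = b;
    victim->last_used = ++ticks;
    return victim;
}

const char *cache_read(int b) { return get_slot(b)->data; }

/* The update stays in main memory; the disk is touched later. */
void cache_write(int b, const char *text) {
    Slot *s = get_slot(b);
    strncpy(s->data, text, BLOCK_SIZE - 1);
    s->data[BLOCK_SIZE - 1] = '\0';
    s->dirty = 1;
}

/* Synchronization: write every modified block back to disk. */
void cache_sync(void) {
    for (int i = 0; i < CACHE_SLOTS; i++) write_back(&cache[i]);
}
```

After `cache_write`, repeated reads of the same block cost no further physical I/O, and the disk copy only changes when `cache_sync` runs or the slot is evicted.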
SUMMARY OF THE INTERACTION BETWEEN LOGICAL AND PHYSICAL LEVELS

[Figure: at the application level, files F1, F2, F3 are used by programs running on the processor with its caches and main memory (MM); system calls (I/O) reach the operating system and its File Management System (FMS), which accesses the blocks of F1, F2, F3 at the physical level. The numbers 1 to 7 in the figure correspond to the steps below.]

1. Application instruction
2. Operating System (OS) and communication with the FMS
3. Creation of a buffer
4. File localization and verification of characteristics
5. Transfer of data blocks to the buffer
6, 7. Communication between application, processor, and MM
FILE STRUCTURES: DESCRIPTION

❑ A file structure is a fundamental concept in the field of file management.

❑ It allows for the conceptualization and definition of specific data structures and algorithms to efficiently manage data stored in a file within a file management system (FMS). (Windows: NTFS, FAT; Linux: inode)

❑ It serves to optimize data access performance, both in terms of execution time and memory usage.
FILE STRUCTURES: ACCESS AND FUNCTIONS

[Figure: blocks of files F1, F2, F3 on the storage medium; Bloc 1 holds Rec 1, Rec 2, Rec 3 and Bloc 4 holds Rec 13, Rec 14, with sample records such as (Ahmed, Benarab, 19 y) and (Leila, Soualmi, 26 y)]

• File Block Organization on Storage Media
This organization can vary depending on the FMS used and the desired performance.
• Placement of Records Within Blocks
Can specify the block size, the way to store records (sequential, random, etc.), and the metadata associated with each record.
• Necessary Characteristics and Information for Manipulating the File
• Number of Buffers to Reserve in Main Memory
Defines how many buffers must be reserved in main memory to optimize data access.
• Implementation of Access Operations
Search, insertion, deletion, update algorithms, etc.
→ Optimization of access performance
Minimize execution times of access operations and use main memory efficiently to avoid frequent accesses to storage media.
FILE STRUCTURES: PERFORMANCE

When dealing with files and optimizing performance, two main criteria are generally taken into account: the number of physical Input/Output (I/O) operations and the occupation of memory space.

Performance Criterion | Description | Optimal Objective
Number of Input/Output (I/O) | Number of read/write operations to/from physical storage (hard drive, SSD, etc.). | Minimize the number of I/O operations to improve performance.
Memory Occupation | Utilization of central memory by data structures and file processing operations, measured as: size of the used data (records) / size of stored data. | Maximize the efficiency of memory utilization (close to 100%).
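
The memory occupation ratio (size of used data divided by size of stored data) can be computed directly. In this small sketch the "stored" size is taken to be the number of allocated blocks times the block size, which is an assumption about how the ratio is meant to be measured; the function name `occupation` is hypothetical.

```c
#include <assert.h>

/* Fraction of the allocated space that actually holds records:
   (records * record_size) / (blocks * block_size). */
double occupation(long records, long record_size, long blocks, long block_size) {
    assert(records >= 0 && record_size >= 0);
    if (blocks <= 0 || block_size <= 0) return 0.0;
    return (double)(records * record_size) / (double)(blocks * block_size);
}
```

For instance, 15 records of 100 bytes stored in 3 blocks of 1000 bytes use 1500 of 3000 bytes, an occupation of 0.5; the objective stated in the table is a value close to 1.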
ABSTRACT MACHINE FOR THE CONCEPTUALIZATION OF FILE MANIPULATION IN MEMORY

[Figure: the user, through the HMI, reaches programs and database management systems; these call the abstract machine (Open, Allocate, Write, Characteristics, ..., Close), which hides the OS, drivers, FMS, and hardware.]

How?
❑ The abstract machine is a conceptual (virtual) model that describes how file management operations are performed within a computer information system.

❑ This abstract machine simplifies hardware and software complexities (algorithms) to focus on the fundamental aspects of file manipulation.
ABSTRACT MACHINE FOR FILE MANIPULATION IN MEMORY: ABSTRACT DEFINITION OF A FILE

[Figure: blocks of files F1, F2, F3 in secondary storage and the buffers T1, T2, T3, T4 in main memory]

❑ F1: a file seen as a vector of BLOCKS, accessed through the BUFFER T1.
❑ F2: a file seen as a list of BLOCKS, accessed through the BUFFER T2.
❑ F3: a file seen as a vector of BLOCKS, accessed through the BUFFERS T3, T4.
❑ For each file, there are header blocks to ensure proper manipulation of the file and its data blocks.

CONST MaxE = 10;                    // Maximum number of records per block

Type Tenreg = Struct                // Content of the record
    field1 : Typeqq
    field2 : Typeqq
    …
End

Type TBloc = Struct                 // Content of the block
    tab : Table [MaxE] of Tenreg    // Table of records
    NE : integer                    // Number of records, 0 <= NE <= MaxE
End

F3 : FILE of TBloc BUFFER buff HEADER(type1, type2, ... typem);
ABSTRACT MACHINE FOR FILE MANIPULATION IN
MEMORY: MODELS AND ALGORITHMS

❑ To write algorithms on file structures, we will use the abstract machine defined by
the following model:

{Open, Close, ReadDir, WriteDir, Set_Header, Header, Allocblock }

❑ In this model, we manipulate block numbers relative to the beginning of each file
(these are therefore virtual numbers (VCN)).

❑ The use of physical addresses is not particularly useful at this level of presentation.

❑ In this model, a file is therefore a set of virtually numbered blocks (1, 2, 3, ...n).

ABSTRACT MACHINE FOR FILE MANIPULATION IN MEMORY: MODELS AND ALGORITHMS

Open( F , Filename , mode ): Opens or creates a file.
    Mode 'A': Opens an existing file for both reading and writing ('A' stands for 'Ancien', French for old, i.e. existing).
    Mode 'N': Creates a new file for reading and writing ('N' stands for 'New').
    The characteristics are allocated in the MM during the creation of the file.

Close( F ): Closes the file. File characteristics are saved in secondary storage (SS).

ReadDir( F , i , buf ): Reads the contents of the i-th block of file F and stores it in the variable buf.

WriteDir( F , i , buf ): Writes the contents of the variable buf into the i-th block of file F.

Header( F , i ): Returns the value of the i-th characteristic associated with file F. This could be information like file size, creation date, permissions, etc.

Set_Header( F , i , v ): Sets the value of the i-th characteristic of file F to v. This allows you to modify file attributes.

Allocblock( F ): Allocates a new block to file F and returns the number of the newly allocated block. This is used for extending a file.
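
One possible way to realize this abstract machine in C is sketched below on top of <stdio.h>. It is not the course's reference implementation: the on-disk layout (a fixed array of characteristics at offset 0, followed by blocks numbered from 1), the record type `TRec`, the constant `HEADER_FIELDS`, the convention that characteristic 1 stores the number of blocks, and the integer return codes are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXE          10   /* records per block */
#define HEADER_FIELDS 2    /* assumption: characteristic 1 = number of blocks */

typedef struct { char name[32]; int age; } TRec;   /* hypothetical record */
typedef struct { TRec tab[MAXE]; int NE; } TBloc;  /* block of records   */

typedef struct {
    FILE *fp;
    long  header[HEADER_FIELDS];  /* characteristics kept in main memory */
} TFile;

/* Mode 'N' creates a new file, mode 'A' opens an existing one.
   Returns 0 on success, -1 on failure. */
int Open(TFile *F, const char *filename, char mode) {
    memset(F->header, 0, sizeof F->header);
    F->fp = fopen(filename, mode == 'N' ? "wb+" : "rb+");
    if (F->fp == NULL) return -1;
    size_t n = (mode == 'N')
        ? fwrite(F->header, sizeof F->header, 1, F->fp)   /* reserve header space */
        : fread(F->header, sizeof F->header, 1, F->fp);   /* load characteristics */
    if (n != 1) { fclose(F->fp); F->fp = NULL; return -1; }
    return 0;
}

long Header(const TFile *F, int i)       { return F->header[i - 1]; }
void Set_Header(TFile *F, int i, long v) { F->header[i - 1] = v; }

/* Position the stream on block i (blocks are numbered from 1). */
static int seek_block(TFile *F, long i) {
    long offset = (long)sizeof F->header + (i - 1) * (long)sizeof(TBloc);
    return fseek(F->fp, offset, SEEK_SET);
}

int ReadDir(TFile *F, long i, TBloc *buf) {
    if (seek_block(F, i) != 0) return -1;
    return fread(buf, sizeof *buf, 1, F->fp) == 1 ? 0 : -1;
}

int WriteDir(TFile *F, long i, const TBloc *buf) {
    if (seek_block(F, i) != 0) return -1;
    return fwrite(buf, sizeof *buf, 1, F->fp) == 1 ? 0 : -1;
}

/* Extend the file by one block and return the new block's number.
   The block's content is written later with WriteDir. */
long Allocblock(TFile *F) {
    Set_Header(F, 1, Header(F, 1) + 1);
    return Header(F, 1);
}

/* Save the characteristics to secondary storage, then close the file. */
int Close(TFile *F) {
    int ok = fseek(F->fp, 0L, SEEK_SET) == 0
          && fwrite(F->header, sizeof F->header, 1, F->fp) == 1;
    return (fclose(F->fp) == 0 && ok) ? 0 : -1;
}
```

A typical sequence is Open in mode 'N', Allocblock, WriteDir on the returned number, and Close; reopening in mode 'A' then gives back the block count through Header(F, 1). Block numbers stay relative to the file, in line with the virtual numbering of the model.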
ABSTRACT MACHINE FOR FILE MANIPULATION IN MEMORY: ALGORITHMS (EXAMPLE)

CONST MaxE = 10;                        // Maximum number of records per block

Type Treatment = Struct
    Medicament : string;
    Dose : float;
    Periode : string;
End

Type TPatient = Struct                  // Content of a record
    Name : string;
    Tests : float;
    Treatment_doses : Treatment[10];
End

Type TBloc = Struct                     // Content of a block
    tab : Table [MaxE] of TPatient
    NE : integer                        // Number of occupied records
End

Patients : FILE of TBloc BUFFER T1, T2, ... HEADER (NE1 : integer, NE2 : integer);
ABSTRACT MACHINE FOR FILE MANIPULATION IN MEMORY: ALGORITHMS (EXAMPLE)
// Print the content of the file
BEGIN
    Open(F, "Patients.dat", 'A');        // Open in 'A' (existing file) mode
    Nb_Bloc ← Header(F, 1);              // Retrieve the characteristic that stores the number of blocks in the file
    i ← 1;
    WHILE i <= Nb_Bloc DO                // Scan the file block by block
        ReadDir(F, i, Buf);              // Read block i into the variable Buf
        j ← 1;
        WHILE j <= Buf.NE DO             // Go through the records stored in the block
            Val ← Buf.tab[j];
            write(Val);
            j ← j + 1;
        ENDWHILE
        i ← i + 1;
    ENDWHILE
    Close(F)
END.
FILES IN C/C++ LANGUAGE

FILES IN THE C LANGUAGE: DEFINITION
❑ At the level of the runtime support, a file is seen as a linear sequence of bytes without a particular structure.
❑ The elements of a file are therefore not typed; it is up to the programmer to ensure their management.
❑ Files in C are manipulated through the <stdio.h> library.

Text file | Binary file
Text files contain human-readable ASCII or Unicode characters, typically organized in lines of text. | Binary files can contain any sequence of bytes, including data that is not human-readable.
Larger than binary files for storing the same amount of information. | More compact because they store raw data without any character overhead.
Text files can be easily read and modified by humans using text editors. | Binary files are not human-readable and often require specific software to be interpreted.
FILES IN THE C LANGUAGE: OPENING AND CLOSING (TEXT / BINARY)

Opening:

FILE *file;   // Declare a pointer of type FILE
file = fopen(const char *name, const char *mode);

The mode selects one of three basic actions: read, write/create, or append (add to the end of the file).

Opening mode
Action | Text | Binary
Read | "r" | "rb"
Write | "w" | "wb"
Write at the end (append) | "a" | "ab"
Read/Write | "r+" | "rb+"
Read/Write (creation) | "w+" | "wb+"
Read/Write at the end | "a+" | "ab+"

Return:
❑ FILE * if everything goes well (access to data)
❑ NULL in case of error (e.g. nonexistent file, absence of access rights, ...)
FILES IN THE C LANGUAGE: OPENING AND CLOSING (TEXT / BINARY)
Closing:
int fclose(FILE *file);
❑ Disconnects the file pointer from the physical file.

Example Open / Close (text):

#include <stdio.h>

int main(void) {
    FILE *file = fopen("Patient.txt", "r");
    if (file == NULL)
        printf("Patient.txt inaccessible: file=%p\n", (void *)file);
    else
        printf("Patient.txt accessible: file=%p\n", (void *)file);
    if (file != NULL) {
        printf("Close Patient\n");
        fclose(file);
    }
    return 0;
}

The binary version is identical except for the file name and the mode:

    FILE *file = fopen("Patient.dat", "rb");

Sample output shown on the slide: when the file cannot be opened, the program reports "inaccessible" with file=NULL; when it can, it reports "accessible" with an address such as file=0x1d23050, followed by the close message.
FILES IN THE C LANGUAGE: OPENING AND CLOSING (TEXT FILES)

By Character: fgetc / fputc

fgetc
❑ Used to read one character at a time from a file.
❑ Returns the integer representing the character read, or the constant EOF in case of end of file or read error.

fputc
❑ Writes the character c at the current position of the file f.
❑ In case of error, the function returns the constant EOF.

#include <stdio.h>

int main(void) {
    FILE *file = fopen("caractere.txt", "w");
    if (file == NULL) {
        printf("File loading fails\n");
        return 1;
    }
    // Write the character 'A' to the file using fputc
    int res = fputc('A', file);
    if (res != EOF) {
        printf("The character has been successfully written to the file.\n");
    } else {
        printf("Error while writing to the file\n");
    }
    fclose(file);

    file = fopen("caractere.txt", "r");
    if (file == NULL) {
        printf("File loading fails\n");
        return 1;
    }
    // Read one character using fgetc; the result is an int so that EOF
    // can be told apart from a valid character
    int character = fgetc(file);
    if (character != EOF) {
        printf("char : %c\n", character);
    } else {
        printf("End of file reached.\n");
    }
    fclose(file);
    return 0;
}
FILES IN THE C LANGUAGE: OPENING AND CLOSING (TEXT FILES)

By line: fgets / fputs

fgets:
char *fgets(char *str, int size, FILE *file);
❑ Used to read a complete line of text from a file.
❑ It takes three main arguments:
▪ str : A pointer to a character array where the read line will be stored.
▪ size : The maximum length of the line to read.
▪ file : A pointer to the file.

fputs:
int fputs(const char *str, FILE *file);
❑ Used to write a whole line of text to a file.

// Example: read a file and copy it into another file
#include <stdio.h>

int main(void) {
    FILE *source = fopen("source.txt", "r");
    if (source == NULL) {
        printf("Error while loading the file\n");
        return 1;
    }
    // Open the destination file in writing mode
    FILE *destination = fopen("destination.txt", "w");
    if (destination == NULL) {
        printf("Error while loading the file\n");
        fclose(source);
        return 1;
    }
    char line[2048];   // a table to store the line retrieved from the file
    while (fgets(line, sizeof(line), source) != NULL) {
        // if the line is not empty, write it to the destination file
        if (line[0] != '\n') {
            fputs(line, destination);
        }
    }

    fclose(source);
    fclose(destination);
    return 0;
}
OPENING AND CLOSING (TEXT FILES)

Formatted R/W: fscanf / fprintf

fscanf:
int fscanf(FILE *file, const char *format, ...);

❑ Used to read formatted data from a text file.
❑ It takes two main arguments, followed by the addresses of the variables to fill:
   ▪ file : A pointer to the text file.
   ▪ format : A format string specifying how the data should be read (%d, %s, %f, ...).

fprintf:
int fprintf(FILE *file, const char *format, ...);

❑ Used to write formatted data to a text file.

    #include <stdio.h>

    struct Patient {
        char name[100];
        int age;
        char diagnostic[100];
    };

    // Sample data: single words only, because %s stops at the first whitespace
    struct Patient patients[3] = {
        {"Ahmed", 35, "Flu"},
        {"Leila", 28, "Covid19"},
        {"Laamri", 82, "Parkinson"}
    };

    int main() {
        FILE *file = fopen("patients.txt", "w");
        if (file == NULL) {
            printf("Error while opening the file");
            return 1;
        }
        for (int i = 0; i < 3; i++) { // fprintf writes the patient data into the file
            fprintf(file, "Name: %s\n", patients[i].name);
            fprintf(file, "Age: %d\n", patients[i].age);
            fprintf(file, "Diagnostic: %s\n", patients[i].diagnostic);
            fprintf(file, "\n"); // an empty line to separate the patients' records
        }
        fclose(file);

        file = fopen("patients.txt", "r");
        if (file == NULL) {
            printf("Error while opening the file");
            return 1;
        }
        for (int i = 0; i < 3; i++) { // fscanf reads the data back from the file
            // The trailing "\n" in each format skips all whitespace,
            // including the empty line that ends each record
            fscanf(file, "Name: %99s\n", patients[i].name);
            fscanf(file, "Age: %d\n", &patients[i].age); // %d needs the address of the int
            fscanf(file, "Diagnostic: %99s\n", patients[i].diagnostic);
            // print the data
            printf("Name: %s\n", patients[i].name);
            printf("Age: %d\n", patients[i].age);
            printf("Diagnostic: %s\n", patients[i].diagnostic);
        }
        fclose(file);
        return 0;
    }
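The example above ignores the value returned by fscanf. As a minimal sketch, assuming the same record layout ("Name", "Age", "Diagnostic", then an empty line), the return value can be used to detect a missing or malformed record. The helper name read_patient is hypothetical and not part of the course material.

```c
#include <stdio.h>

struct Patient {
    char name[100];
    int age;
    char diagnostic[100];
};

/* Hypothetical helper: reads one record in the
   "Name: ... / Age: ... / Diagnostic: ..." layout.
   Returns 1 on success, 0 if the record is missing or malformed. */
int read_patient(FILE *file, struct Patient *p) {
    /* fscanf returns the number of fields it assigned; 3 are expected here.
       A space in the format skips any whitespace, including newlines.
       %99s keeps each string within the 100-byte arrays. */
    int fields = fscanf(file, " Name: %99s Age: %d Diagnostic: %99s",
                        p->name, &p->age, p->diagnostic);
    return fields == 3;
}
```

A reading loop can then be written as while (read_patient(file, &p)) { ... }, which stops at the end of the file or at the first record that does not match the format.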
FILES IN THE C LANGUAGE: READING / WRITING (BINARY FILE)
❑ fread and fwrite are the two standard C library functions used to read and write binary data from/to binary files.

❑ These functions are particularly useful for manipulating raw data, such as structures or arrays, and for storing them in or extracting them from binary files.

fread:
size_t fread(void *ptr, size_t size, size_t count, FILE *file);

❑ ptr : A pointer to the memory area where the data read will be stored.
❑ size : The size in bytes of each element to be read.
❑ count : The number of elements to be read.
❑ file : A pointer to the binary file from which to read the data.
❑ Returns the number of elements actually read, which may be less than count.

fwrite:
size_t fwrite(const void *ptr, size_t size, size_t count, FILE *file);

❑ ptr : A pointer to the memory area that contains the data to be written.
❑ size : The size in bytes of each element to be written.
❑ count : The number of elements to be written.
❑ file : A pointer to the binary file in which to write the data.
❑ Returns the number of elements actually written.
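Because fread returns the number of complete elements it read, the caller does not need to know in advance how many records a file holds. The sketch below illustrates this under the assumption that the file contains only struct Patient records. The helper names save_patients and load_patients are hypothetical.

```c
#include <stdio.h>

struct Patient {
    char name[100];
    int age;
    char diagnostic[100];
};

/* Hypothetical helper: writes count records to a binary file.
   Returns the number of records actually written. */
size_t save_patients(const char *path, const struct Patient *in, size_t count) {
    FILE *file = fopen(path, "wb");
    if (file == NULL) {
        return 0;
    }
    size_t written = fwrite(in, sizeof(struct Patient), count, file);
    fclose(file);
    return written;
}

/* Hypothetical helper: reads at most max records from a binary file.
   Returns the number of complete records read, which is smaller
   than max when the file holds fewer records. */
size_t load_patients(const char *path, struct Patient *out, size_t max) {
    FILE *file = fopen(path, "rb");
    if (file == NULL) {
        return 0;
    }
    size_t loaded = fread(out, sizeof(struct Patient), max, file);
    fclose(file);
    return loaded;
}
```

For instance, saving 2 records and then calling load_patients with max set to 5 returns 2, and only the first two entries of the output array are meaningful.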
FILES IN THE C LANGUAGE: READING / WRITING (BINARY FILE)

    #include <stdio.h>

    // Record to represent a patient
    struct Patient {
        char name[100];
        int age;
        char diagnostic[100];
    };

    struct Patient patients[3]; // an array of patients

    int main() {
        // Fill the array with the patients' data
        struct Patient patient1 = {"Ahmed Benarab", 35, "Flu"};
        struct Patient patient2 = {"Leila Slimani", 28, "Covid19"};
        struct Patient patient3 = {"Laamri Moustfaoui", 82, "Parkinson"};
        patients[0] = patient1; patients[1] = patient2; patients[2] = patient3;

        // Open the binary file for writing
        FILE *file = fopen("patients.dat", "wb");
        if (file == NULL) {
            printf("Error while opening the file");
            return 1;
        }
        // Write the data to the file
        size_t elements = fwrite(patients, sizeof(struct Patient), 3, file);
        if (elements != 3) {
            printf("Error while writing data to the file\n");
        } else {
            printf("Success.\n");
        }
        fclose(file);

        // Open the binary file for reading
        file = fopen("patients.dat", "rb");
        if (file == NULL) {
            printf("Error while opening the file");
            return 1;
        }
        printf("Patients' data retrieved from the file:\n");
        // Read the data from the binary file
        elements = fread(patients, sizeof(struct Patient), 3, file);
        if (elements != 3) {
            printf("Error while reading data from the file\n");
        } else {
            for (int i = 0; i < 3; i++) {
                printf("Patient %d:\n", i + 1);
                printf("Name: %s\n", patients[i].name);
                printf("Age: %d\n", patients[i].age);
                printf("Diagnostic: %s\n", patients[i].diagnostic);
                printf("\n");
            }
        }
        fclose(file);
        return 0;
    }
FILES IN THE C LANGUAGE: CHANGING POSITION (BINARY FILE)

fseek:
int fseek(FILE *file, long offset, int origin);

❑ The function fseek() in the C language is used to move the file position pointer within a binary file. It allows you to specify the new position for reading or writing within the file.
❑ Parameters:
   ▪ file : A pointer to the binary file.
   ▪ offset : The amount of movement, in bytes, relative to the specified origin.
   ▪ origin : The origin from which to perform the movement. This can be one of the following constants:
      ▪ SEEK_SET : Move from the beginning of the file.
      ▪ SEEK_CUR : Move from the current position of the pointer.
      ▪ SEEK_END : Move from the end of the file.
❑ Returns 0 on success and a non-zero value on failure.

    #include <stdio.h>

    int main() {
        FILE *file = fopen("data.bin", "rb");
        if (file == NULL) {
            printf("Error while opening the file");
            return 1;
        }
        // Move the pointer 2 bytes forward from the current position
        int val = fseek(file, 2, SEEK_CUR);
        if (val != 0) {
            printf("Error while moving the pointer to the indicated position");
            fclose(file);
            return 1;
        }
        // At this point, the file position pointer has been moved 2 bytes
        // from its initial position.
        // feof() only reports the end of the file after a read has hit it,
        // so try to read one byte before testing.
        if (fgetc(file) == EOF && feof(file)) {
            printf("\nEnd of file.\n");
        } else {
            printf("\nThe pointer is not at the end of the file.\n");
        }
        fclose(file);
        return 0;
    }

Note: feof(FILE *file) tests whether the end of the file has been reached. It only becomes true after a read operation has tried to read past the end, and a successful fseek() resets it.
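A common use of fseek with binary files is direct access to the record at a given index, since every record has the same size. The sketch below assumes a file that contains only struct Patient records, numbered from 0. The helper name read_patient_at is hypothetical.

```c
#include <stdio.h>

struct Patient {
    char name[100];
    int age;
    char diagnostic[100];
};

/* Hypothetical helper: reads the record at position index (0-based)
   without reading the records that come before it.
   Returns 1 on success, 0 if the move or the read fails. */
int read_patient_at(FILE *file, long index, struct Patient *out) {
    /* Record number index starts index * sizeof(struct Patient)
       bytes from the beginning of the file. */
    long offset = index * (long)sizeof(struct Patient);
    if (fseek(file, offset, SEEK_SET) != 0) {
        return 0;
    }
    /* fread returns 1 only if one complete record was available. */
    return fread(out, sizeof(struct Patient), 1, file) == 1;
}
```

With a file opened in "rb" mode, read_patient_at(file, 2, &p) loads the third record directly. Asking for an index beyond the last record makes fread return 0, so the helper reports a failure.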
FILES IN THE C LANGUAGE: RECAP

Function  Description                                                      Function Signature
fopen     Opens a file and returns a file pointer to it.                   FILE *fopen(const char *filename, const char *mode);
feof      Checks whether a read has reached the end of the file.           int feof(FILE *file);
fclose    Closes a previously opened file.                                 int fclose(FILE *file);
fgetc     Reads a single character from the file.                          int fgetc(FILE *file);
fputc     Writes a single character to the file.                           int fputc(int character, FILE *file);
fgets     Reads a line of text (up to a newline character) from the file.  char *fgets(char *string, int size, FILE *file);
fputs     Writes a string of characters to the file.                       int fputs(const char *string, FILE *file);
fscanf    Reads formatted values from the file.                            int fscanf(FILE *file, const char *format, ...);
fprintf   Writes formatted values to the file.                             int fprintf(FILE *file, const char *format, ...);
fread     Reads a block of binary data from the file.                      size_t fread(void *ptr, size_t size, size_t count, FILE *file);
fwrite    Writes a block of binary data to the file.                       size_t fwrite(const void *ptr, size_t size, size_t count, FILE *file);
fseek     Moves the file position pointer within the file.                 int fseek(FILE *file, long offset, int origin);
