Unit V : Files
CHAPTER 10 : FILES
Takeaways
• Streams in C
• Error handling
• Renaming files
• Reading data from files
• Command line arguments
• Creating temporary files
• Writing data to files
• Random access of data
INTRODUCTION TO FILES
A file is a collection of data stored on a secondary storage device such as a hard disk. Until now, we have been processing data entered through the computer's keyboard. This becomes very tedious when there is a huge amount of data to be processed. A better solution, therefore, is to combine all the input data into a file and then design a C program to read this data from the file whenever required.
Broadly speaking, files are used because real-life applications involve large amounts of data, and in such applications console-oriented I/O operations pose two major problems:
• First, it becomes cumbersome and time-consuming to handle huge amounts of data through terminals.
• Second, when doing I/O through a terminal, all the data is lost when the program terminates or the computer is turned off. Therefore, it becomes necessary to store data on a permanent storage device (such as a disk) and read it whenever necessary, without destroying the data.
In order to use files, we have to learn file input and output operations, i.e., how data is read from or written to a file.
File I/O operations are almost the same as terminal I/O operations. The only difference is that when doing file I/O, the user must specify the name of the file from which data should be read or to which it should be written.
In C, the standard streams are pre-connected input and output channels between a text terminal and the program, available when the program begins execution. A stream is therefore a logical interface to the devices that are connected to the computer.
A stream is widely used as a logical interface to a file, where a file can refer to a disk file, the computer screen, the keyboard, etc. Although files may differ in form and capabilities, all streams behave the same way. The three standard streams (Figure 9.1) in C are as follows:
• standard input (stdin),
• standard output (stdout), and
• standard error (stderr).
Standard input (stdin) Standard input is the stream from which the program receives its data. The program requests transfer of data using the read operation. However, not all programs require input. Generally, unless redirected, input for a program is expected from the keyboard.
Standard output (stdout) Standard output is the stream to which a program writes its output data. The program requests data transfer using the write operation. However, not all programs generate output.
Standard error (stderr) Standard error is an output stream used by programs to report error messages or diagnostics. It is independent of standard output and can be redirected separately. Standard output and standard error can also be directed to the same destination.
A stream is linked to a file using an open operation and dissociated from a file using a close operation.
When a stream linked to a disk file is created, a buffer is automatically created and associated with the stream. A buffer is simply a block of memory used for temporary storage of data that has to be read from or written to a file.
Buffers are needed because disk drives are block-oriented devices: they operate efficiently when data is read or written in blocks of a certain size. The ideal buffer size is hardware-dependent.
The buffer acts as an interface between the stream (which is character-oriented) and the disk hardware (which is block-oriented). When the program writes data to the stream, the data is saved in the buffer until the buffer is full. Then the entire contents of the buffer are written to the disk as a block. This is shown in Figure 9.2.
Similarly, when reading data from a disk file, the data is read as a block from the file and written into the buffer. The program reads data from the buffer. The creation and operation of the buffer are handled automatically by the C library and the operating system. However, C provides some functions for buffer manipulation. Data written to a stream resides in the buffer until the buffer is flushed, that is, written to the file.
In C, the types of files used can be broadly classified into two categories: ASCII text files and binary files.
ASCII Text Files
A text file is a stream of characters that a computer can process sequentially in the forward direction. For this reason, a text file is usually opened for only one kind of operation (reading, writing, or appending) at any given time. Because text files are character-oriented, data is read from or written to them as a sequence of characters. In C, a text stream is treated as a special kind of file.
Depending on the requirements of the operating system and on the operation being performed on the file (read or write), newline characters may be converted to or from carriage return/line feed combinations. Other character conversions may also be done to satisfy the storage requirements of the operating system. These conversions happen transparently when a text file is processed.
In a text file, each line contains zero or more characters and ends with one or more characters that specify the end of line. The C standard only requires an implementation to support lines of at least 254 characters, including the terminating newline; longer lines are usually supported but are not guaranteed. A line in a text file is not a C string, so it is not terminated by a null character. On systems that use a carriage return/line feed pair as the line ending (Windows, for example), each newline character written to a text file is converted to that pair, and each pair read from a text file is converted back into a single newline character.
Programming Tip:
The contents of a binary file are not human-readable. If you want the data stored in the file to be human-readable, then store the data in a text file.
Another important point is that when a text file is used, there are actually two representations of data: internal and external. For example, an int value is represented internally as 2 or 4 bytes of memory, but externally it is represented as a string of characters giving its decimal or hexadecimal value. To convert the internal representation into the external one, we can use the printf and fprintf functions. Similarly, to convert an external representation into the internal one, scanf and fscanf can be used. We will read more about these functions in the coming sections.
Note
In a text file, each line of data ends with a newline character. The end of the file itself is reported to the program through the end-of-file (EOF) indicator.
Binary Files
A binary file may contain any type of data, encoded in binary form for computer storage and processing purposes. Like a text file, a binary file is a collection of bytes. In C, a byte and a character are equivalent. Therefore, a binary file is also referred to as a character stream, with the following two essential differences:
• A binary file does not require any special processing of the data; each byte of data is transferred to or from the disk unprocessed.
• C imposes no structure on the file, and it may be read from, or written to, in any manner the programmer wants.
While text files are processed sequentially, binary files can be processed either sequentially or randomly, depending on the needs of the application. In C, to process a file randomly, the programmer must move the current file position to the appropriate place in the file before reading or writing data. For example, if a file is used to store records (using structures) of students, then to update a particular record, the programmer must first locate the appropriate record, read the record into memory, update it, and finally write the record back to the disk at its appropriate location in the file.
Note
Binary files store data in the internal representation format. Therefore, an int value will be stored in binary form as a 2 or 4 byte value. The same format is used to store data in memory as well as in the file. As with a text file, the end of a binary file is reported through the EOF indicator.
In a text file, the integer value 123 is stored as a sequence of three characters: 1, 2, and 3. Each character takes 1 byte, so storing the integer value 123 needs 3 bytes. In a binary file, the same int value is stored in binary form in 2 bytes (or 4 bytes on systems where an int is 4 bytes wide), whatever its magnitude. For larger values such as 12345, binary files therefore take less space than text files to store the same piece of data. They also eliminate the conversion between internal and external representations and are thus generally more efficient than text files.
Programming in C
CS3251 | 2nd Semester CSE Dept | 2021 Regulation