Types of file systems. File operations. Directories. Operations with directories. File systems. File system structure

File system structure. File access mechanism.

File - a named collection of data. A file can be operated on as a whole using the operations open, close, create, destroy, copy, rename, and list. In addition, operations on individual file components are possible: read, write, update, insert, and delete.

File organization

File organization refers to the way records are arranged in external memory. There are the following methods of organization.

· Sequential - records are arranged in physical order, i.e. the “next” record is a record that physically follows the previous one; here records can be either fixed length or variable.

[Figure: fixed-length records; variable-length records with record length indicators]

· Index-sequential - records are arranged in a logical sequence according to the values of the keys contained in each record. Index-sequential records can be accessed sequentially, in ascending/descending order of key values, or directly by key, by searching the system index.


· Direct - records are accessed randomly by their physical addresses on a direct-access storage device.

· Partitioned (library) - essentially a file consisting of sequential subfiles, where each subfile is called an element, or member, of the file. The starting address of each member is stored in the file's directory. Partitioned files are most often used to store program libraries or macro libraries.
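The sequential organization with variable-length records can be illustrated with a small sketch. It assumes, purely for illustration, that each record is preceded by a 2-byte big-endian length indicator; the function names `write_records` and `read_records` are hypothetical and not taken from any real file system.

```python
import io
import struct

def write_records(stream, records):
    """Write each record preceded by a 2-byte length indicator (assumed format)."""
    for rec in records:
        stream.write(struct.pack(">H", len(rec)))
        stream.write(rec)

def read_records(stream):
    """Read records back in physical order until the stream is exhausted."""
    out = []
    while True:
        header = stream.read(2)
        if len(header) < 2:
            break
        (length,) = struct.unpack(">H", header)
        out.append(stream.read(length))
    return out

# An in-memory stream stands in for external memory.
buf = io.BytesIO()
write_records(buf, [b"alpha", b"be", b"gamma-delta"])
buf.seek(0)
records = read_records(buf)
```

The "next" record is simply the one that physically follows, so reaching record N requires reading the length indicators of all records before it.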

Access Methods

Operating systems typically implement several methods of access to files, which can be grouped into two categories:

· queued access methods;

· basic access methods.

Queued access methods are used when the sequence of record processing can be predicted, for example, in sequential and index-sequential organizations. These methods provide anticipatory buffering and scheduling of I/O operations. In addition, they provide automatic blocking and deblocking of records.

Basic access methods are usually used when the sequence of record processing cannot be predicted, in particular with direct (random) access. Basic methods read and write physical blocks; blocking and deblocking, if necessary, is performed by the user.

File characteristics

· Volatility - the frequency with which new records are added to the file and old ones are deleted. When the frequency is low, the file is called static; when it is high, the file is called dynamic, or volatile.

· Activity - the percentage of the file's records processed during a given run.

· Size - the amount of information stored in the file.

File system

File system - the part of the overall memory management system (see Structure of the OS kernel) whose main purpose is to manage files stored in external memory and to provide controlled sharing of information between users.

File system functions

· providing the ability to create, modify, and destroy files;

· controlled sharing of files by several users;

· giving the user the ability to specify different file structures and to control the transfer of information between files;

· providing means for ensuring the safety and recovery of information in files;

· ensuring the independence of files from external devices, i.e., users should be able to access files by symbolic names;

· protecting information in files from unauthorized access (including the ability to encrypt and decrypt data);

· providing a user-friendly interface.

File system composition

The file system that is part of the OS kernel usually contains the following tools:

· Access methods, which determine the specific organization of access to data stored in files.

· File management tools, which provide file storage, access, sharing, and protection.

· External memory management tools, which allocate external memory space for storing files.

· File integrity tools, which guarantee the safety of the information in files.

Placing files on disk memory

Placing files in disk memory is much like memory allocation in variable-partition multiprogramming. During system operation, disk space becomes fragmented, so files have to be placed in scattered blocks. It is possible to use the “garbage collection” (compaction) method already discussed, but this is not always effective.

Contiguous memory allocation


Contiguous memory allocation assumes that each file is allocated one contiguous area of external memory. One advantage of this method is that logically consecutive records are, as a rule, physically adjacent, which increases access speed. Directories are also quite simple to implement, since for each file only the starting address and the length need to be stored. The disadvantage is that after files are destroyed and the space they occupied is returned to the free list, newly created files must fit into the existing free areas. Thus we face the same problems as with fragmentation in variable-partition multiprogramming systems: the need to coalesce adjacent free areas. In addition, this method may be inefficient when file sizes change dynamically.
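The following is a minimal sketch of contiguous allocation, assuming a first-fit placement policy (the text does not name one). The class `ContiguousDisk` and its method names are hypothetical. The directory stores only a starting address and a length per file, and adjacent free areas are coalesced when a file is destroyed.

```python
class ContiguousDisk:
    """Toy model: a disk of `size` units with first-fit contiguous allocation."""

    def __init__(self, size):
        self.free = [(0, size)]   # sorted list of (start, length) free areas
        self.directory = {}       # file name -> (start, length)

    def create(self, name, length):
        # First fit (an assumption): take the first free area that is large enough.
        for i, (start, hole) in enumerate(self.free):
            if hole >= length:
                if hole == length:
                    del self.free[i]
                else:
                    self.free[i] = (start + length, hole - length)
                self.directory[name] = (start, length)
                return start
        raise MemoryError("no single free area is large enough")

    def destroy(self, name):
        # Return the area to the free list and coalesce adjacent free areas.
        self.free.append(self.directory.pop(name))
        self.free.sort()
        merged = [self.free[0]]
        for start, length in self.free[1:]:
            prev_start, prev_length = merged[-1]
            if prev_start + prev_length == start:
                merged[-1] = (prev_start, prev_length + length)
            else:
                merged.append((start, length))
        self.free = merged

disk = ContiguousDisk(10)
for name in ("a", "b", "c"):
    disk.create(name, 3)
disk.destroy("a")
disk.destroy("c")
total_free = sum(length for _, length in disk.free)
try:
    disk.create("d", 5)
    fits = True
except MemoryError:
    fits = False
```

In this run seven units are free in total, yet a five-unit file does not fit, because the free space is split between two separate areas. This is the fragmentation problem described above.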

Noncontiguous memory allocation

Allocation using sector lists

In this case, memory is treated as a set of individual sectors. Files consist of sectors, which may be located in various places in external memory. Sectors belonging to the same file contain pointers to each other, forming a list. The free-space list contains all free sectors of external memory.


If it is necessary to increase the file size, the corresponding process requests an additional number of sectors from among the free ones, and when the size decreases, the freed sectors are returned to the free list. This way, the need for memory compaction is avoided.

The disadvantage of this method of memory allocation is the overhead of maintaining and following the pointer links, as well as a possible increase in access time.
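Allocation using sector lists can be sketched as follows. This is an illustrative model only: the class `SectorDisk`, its methods, and the representation of a sector as a small dictionary holding data and a "next" pointer are all assumptions made for the example.

```python
NIL = None  # marks the end of a file's sector list

class SectorDisk:
    """Toy model of allocation by linked sectors."""

    def __init__(self, n):
        # Each sector holds its data plus a pointer to the next sector of the file.
        self.sectors = [{"data": b"", "next": NIL} for _ in range(n)]
        self.free = list(range(n))   # free-space list
        self.directory = {}          # file name -> number of its first sector

    def append(self, name, data):
        """Grow a file by one sector taken from the free-space list."""
        sector = self.free.pop(0)
        self.sectors[sector] = {"data": data, "next": NIL}
        if name not in self.directory:
            self.directory[name] = sector
        else:
            cur = self.directory[name]
            while self.sectors[cur]["next"] is not NIL:
                cur = self.sectors[cur]["next"]
            self.sectors[cur]["next"] = sector
        return sector

    def chain(self, name):
        """Follow the pointers and list the file's sectors in order."""
        result, cur = [], self.directory.get(name, NIL)
        while cur is not NIL:
            result.append(cur)
            cur = self.sectors[cur]["next"]
        return result

    def shrink(self, name):
        """Shrink a file by one sector and return it to the free-space list."""
        sectors = self.chain(name)
        last = sectors.pop()
        if sectors:
            self.sectors[sectors[-1]]["next"] = NIL
        else:
            del self.directory[name]
        self.free.append(last)
        return last

disk = SectorDisk(6)
disk.append("a", b"a0")
disk.append("b", b"b0")
disk.append("a", b"a1")
disk.append("a", b"a2")
before = disk.chain("a")
disk.shrink("a")
after = disk.chain("a")
```

File "a" ends up in sectors that are not adjacent, and no compaction is needed when it grows or shrinks. The cost is visible in `append` and `chain`, which must walk the pointers one sector at a time.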

Block-based memory allocation

Block allocation combines elements of contiguous and noncontiguous allocation: memory is allocated not by individual sectors but by blocks of adjacent sectors, and when allocating new blocks, the system tries to choose free blocks as close as possible to the file's existing blocks. On each access to the file, the corresponding block is determined first, and then the corresponding sector within that block.

Block allocation can be implemented using block chaining, index block chaining, and block-oriented file mapping tables.

Block chain

[Figure: block chaining. A directory entry (file, location) points to the first block; each block holds data and a pointer to the next block; the last pointer is Nil.]

In the block chaining scheme, a directory entry points to the first block of the file; each fixed-length block of the file contains two parts: the data itself and a pointer to the next block. The minimum unit of allocation is a fixed-size block.

Obviously, to find a specific record in a file, you need to look through the chain, find the corresponding block, and then the required record in the block. Since blocks may be scattered throughout the disk, this process can take a long time. To reduce access time, chains can be made with bidirectional links, which makes it possible to view the chain in both directions.

Index block chain

[Figure: index block chaining. A directory entry (file, location) points to a chain of index blocks.]

Block mapping table

In a scheme with a block-oriented file mapping table, block numbers are used instead of pointers. Usually the numbers can easily be converted to actual addresses. A file mapping table is used, which contains one row per disk block. The entry in the user directory points to the row of the mapping table corresponding to the first block of the file. Each row of the mapping table contains the number of the next block of the file. Thus all blocks of a file can be found in sequence by following the rows of the file mapping table. The rows that correspond to the last blocks of files are usually set to the null pointer Nil. Some rows of the table are marked “free”, indicating that the block can be allocated on a subsequent request.

The main advantage of such a scheme is that the physical proximity of blocks can be judged from the file mapping table.
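A file mapping table of this kind can be sketched in a few lines. The table contents, the markers `FREE` and `NIL`, and the helper names below are hypothetical; the "nearest free block" rule in `allocate` is one possible way, assumed here, of using the proximity information that the table exposes.

```python
FREE = "free"   # row marker: block is available
NIL = "nil"     # row marker: last block of a file

def file_blocks(table, first_block):
    """Follow the mapping table from the first block to Nil."""
    blocks, cur = [], first_block
    while cur != NIL:
        blocks.append(cur)
        cur = table[cur]
    return blocks

def allocate(table, last_block):
    """Extend a file with the free block physically closest to its last block."""
    candidates = [i for i, row in enumerate(table) if row == FREE]
    if not candidates:
        raise MemoryError("no free blocks")
    new = min(candidates, key=lambda i: abs(i - last_block))
    table[last_block] = new
    table[new] = NIL
    return new

# One row per disk block; row i holds the number of the block that follows block i.
table = [FREE, 4, FREE, NIL, 7, FREE, FREE, 3]
directory = {"A": 1}            # the directory entry points to the first row

before = file_blocks(table, directory["A"])
added = allocate(table, before[-1])
after = file_blocks(table, directory["A"])
```

Because the whole table can be scanned without touching the data blocks themselves, choosing a nearby free block is cheap compared with following pointers stored inside the blocks.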

File system structure

The file system structure depends on the operating system. One of the first file systems for personal computers was FAT (File Allocation Table), which was used in the MS-DOS operating system.

FAT was designed to work with floppy disks smaller than 1 MB and initially did not support hard drives. Subsequently, FAT came to support files and partitions up to 2 GB in size.

FAT uses the following file naming conventions:

· the name must begin with a letter or digit and may contain any ASCII character except the space and the characters " / \ : ; | = , ^ * ?

· the name is no more than 8 characters long, followed by a period and an optional extension of up to 3 characters;

· the case of characters in file names is not distinguished and is not preserved.
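The naming rules listed above can be expressed as a small check. This is a simplified sketch that encodes only the rules as stated in this text (it does not, for example, handle control characters or reserved device names); the function name `normalize_fat_name` is hypothetical.

```python
# Characters excluded by the rules as listed in the text, plus the space.
FORBIDDEN = set('"/\\:;|=,^*? ')

def normalize_fat_name(name):
    """Return the upper-cased 8.3 name if it satisfies the rules, else None."""
    if not name.isascii():
        return None
    base, _, ext = name.partition(".")
    # At most 8 characters in the name and at most 3 in the extension.
    if not 1 <= len(base) <= 8 or len(ext) > 3 or "." in ext:
        return None
    # The name must begin with a letter or digit.
    if not base[0].isalnum():
        return None
    if any(ch in FORBIDDEN for ch in base + ext):
        return None
    # Case is neither distinguished nor preserved, so store upper case.
    return name.upper()

results = {
    name: normalize_fat_name(name)
    for name in ("uchebnik.doc", "readme", "my file.txt", "longfilename.txt", "a.html")
}
```

Names that differ only in case map to the same stored name, which is what "not distinguished and not preserved" means in practice.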

The FAT file system cannot control each sector separately, so it groups adjacent sectors into clusters. This reduces the total number of storage units that the file system must keep track of. The cluster size in FAT is a power of two and is determined by the size of the volume when formatting the disk. A cluster represents the minimum amount of space that a file can occupy. This results in some of the disk space being wasted.
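The wasted space mentioned above is easy to quantify. The sketch below computes how many clusters a file occupies and how much of the last cluster is left unused; the file size and the cluster sizes are arbitrary values chosen for illustration.

```python
def clusters_needed(file_size, cluster_size):
    """Number of whole clusters required to hold file_size bytes (rounded up)."""
    return -(-file_size // cluster_size)

def slack(file_size, cluster_size):
    """Bytes allocated to the file but not used by its data."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size

# The same 5000-byte file on volumes with different (assumed) cluster sizes.
report = {size: slack(5000, size) for size in (512, 4096, 32768)}
```

With larger clusters the table has fewer entries to track, but each file wastes more space on average.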

In operating systems, the concepts of directory and folder are used as objects designed to store files and provide access to them.

Access is a procedure for establishing communication with memory and a file located in it for writing and reading data.

When accessing a file, you must specify its exact location. If the file is accessed from the command line, the entry looks like this:

c:\Papka1\papka2\uchebnik.doc

Such a record is called a route, or path.
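As an illustration, the path from the example can be split into its parts with the `PureWindowsPath` class from Python's standard `pathlib` module, which parses Windows-style paths on any platform without touching the disk.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r"c:\Papka1\papka2\uchebnik.doc")

drive = path.drive          # logical drive name
parts = path.parts          # root followed by each directory and the file name
file_name = path.name       # full file name
extension = path.suffix     # extension, including the period
parent = str(path.parent)   # directory that contains the file
```

The drive name selects the logical drive, the directory names are followed level by level, and the last component is the file itself.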

The logical drive name that precedes the file name in the specification identifies the logical drive on which to search for the file. The same disk holds a directory that stores the full names of the files along with their characteristics: date and time of creation, size (in bytes), and special attributes. As in a library catalog system, the full name of a file registered in the directory serves as the code by which the operating system finds the file's location on the disk.

A directory is a catalog of files indicating their location on the disk.

In the Windows operating system, the concept of a directory corresponds to the concept of a folder.

A directory can be in one of two states: current (active) or passive.

Current (active) directory - the directory in which the user is working at the moment.

Passive directory - a directory with which there is currently no connection.

The operating system adopts a hierarchical directory structure. Each disk always has a single main (root) directory. It is located at the zero level of the hierarchical structure and is denoted by the symbol "\" - backslash. The root directory is created when formatting (initializing, partitioning) the disk and has a limited size. The main directory may include other directories and files that are created by operating system commands and can be deleted by appropriate commands.

Parent directory - a directory that has subdirectories.

Subdirectory - a directory that is included in another directory.

Thus, any directory containing lower-level directories is, on the one hand, a parent to them and, on the other hand, subordinate to a higher-level directory.

The directory structure may contain directories that hold no files or subdirectories. Such directories are called empty.

The rules for naming subdirectories are the same as the rules for naming files. To distinguish them formally from files, subdirectories are usually given names only, although an extension can be added using the same rules as for files.

The FAT file system always fills free space on the disk sequentially from beginning to end. When creating a new file or extending an existing one, it looks for the first free cluster in the file allocation table. If some files have been deleted and others have changed in size, the resulting free clusters are scattered across the disk. If the clusters containing a file's data are not adjacent, the file is fragmented. Heavily fragmented files significantly reduce performance. Operating systems that support FAT usually include special disk defragmentation utilities designed to improve the performance of file operations.
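How the "first free cluster" rule leads to fragmentation can be seen in a toy model. The sketch below is not the real FAT on-disk format: the table is a plain Python list, and the markers `FREE` and `EOC` (end of chain) and the function names are assumptions made for the example.

```python
FREE = None   # table entry for an unused cluster
EOC = -1      # table entry for the last cluster of a file

def allocate(fat, count):
    """Take the first `count` free clusters (count >= 1) and chain them."""
    chain = [i for i, entry in enumerate(fat) if entry is FREE][:count]
    if len(chain) < count:
        raise MemoryError("not enough free clusters")
    for cur, nxt in zip(chain, chain[1:]):
        fat[cur] = nxt
    fat[chain[-1]] = EOC
    return chain

def delete(fat, chain):
    """Mark every cluster of a file as free again."""
    for cluster in chain:
        fat[cluster] = FREE

def fragments(chain):
    """Number of runs of adjacent clusters in a file's chain."""
    return 1 + sum(1 for a, b in zip(chain, chain[1:]) if b != a + 1)

fat = [FREE] * 10
file_a = allocate(fat, 3)
file_b = allocate(fat, 3)
delete(fat, file_a)
file_c = allocate(fat, 5)   # reuses the hole left by file_a, then continues after file_b
```

File `file_c` ends up split into two runs around `file_b`. A defragmentation utility would move clusters so that each chain becomes a single run.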

The FAT file system has a significant limitation in supporting large amounts of disk space: the limit is 2 GB.

New generations of hard drives with large amounts of disk space required a more advanced file system.

The Windows operating system includes the FAT32 file system, which supports hard drives of up to two terabytes.
FAT32 has extended file attributes that store the time and date of creation, modification, and last access of a file or directory.
The system allows long file names and spaces in names.
The FAT32 file system is supported by the Windows XP and Windows Vista operating systems.

Another file system was developed for these operating systems: NTFS (New Technology File System).

NTFS significantly expands the ability to control access to individual files and directories, introduces a large number of attributes, and implements fault tolerance and dynamic file compression. NTFS allows file names up to 255 characters long.

NTFS is capable of self-recovery after an OS or hardware failure, so that the disk volume remains available and the directory structure is not damaged.

Each file on an NTFS volume is represented by an entry in a special file, the Master File Table (MFT). NTFS reserves the first 16 table entries, about 1 MB in size, for special information. These entries provide a backup of the master file table, support file recovery, track the state of clusters, and define file attributes.

To reduce fragmentation, NTFS always tries to store files in contiguous blocks. It also provides efficient searching for files in a directory.

NTFS was designed as a recoverable file system using a transaction processing model. Each I/O operation that modifies a file on an NTFS volume is considered a transaction by the system and can be executed as an indivisible block. When a file is modified by a user, the log file service records all the information necessary to repeat or roll back the transaction.
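The idea of logging enough information to repeat or roll back a transaction can be sketched with a toy key-value store. This is not the NTFS log file format; the class `LoggedStore` and its methods are hypothetical, and the sketch shows only the rollback half of recovery.

```python
class LoggedStore:
    """Toy store that logs the old and new value of every change."""

    def __init__(self):
        self.data = {}
        self.log = []            # entries: (transaction id, key, old value, new value)
        self.committed = set()

    def write(self, txn, key, value):
        # Record the change in the log before applying it.
        self.log.append((txn, key, self.data.get(key), value))
        self.data[key] = value

    def commit(self, txn):
        self.committed.add(txn)

    def recover(self):
        """Undo, newest first, every change made by an uncommitted transaction."""
        for txn, key, old, _new in reversed(self.log):
            if txn not in self.committed:
                if old is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old
        self.log = [entry for entry in self.log if entry[0] in self.committed]

store = LoggedStore()
store.write(1, "a", "x")
store.commit(1)
store.write(2, "a", "y")      # transaction 2 never commits (simulated failure)
store.write(2, "b", "z")
store.recover()
```

After recovery the store contains only the effects of the committed transaction, so each transaction appears to have happened entirely or not at all.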

An interesting feature of the file system is dynamic encryption of files and directories, which increases the reliability of information storage.

Self-test questions.

1. What is a file system?

2. What is a "file"?

3. Main components of the file structure.

4. What is a cluster?

5. Name the main parameters characterizing the file.

6. How is the file name formed?

7. Rules for naming files in the FAT system.

8. Why is there a need to defragment the disk?

9. What is a directory?

10. Explain the concepts of “route”, “path”.

11. Why is the extension used in file names?

12. The main purpose of the file system.

13. What file systems are supported by the Windows XP and Windows Vista operating systems?

File - a logical unit of memory allocation. It is also a collection of logically interrelated information. The file system is located in external memory (on disks) and is organized in levels. The structure of a multi-level file system is shown in Fig. 11.19.

Fig. 11.19. Multi-level file system.

At the top level of abstraction are user programs, which use high-level primitives of the form WriteLine(F, X). The level below contains the logical file interface modules: logical records, blocks, and exchange operations. Further down come the file organization modules, and then the operations of the basic file system. At the lower levels are the device drivers (I/O control) and the hardware (I/O devices and their controllers).

File control block (FCB) - a structure in memory containing information about the file. The typical structure of a file control block is presented in Table 3.

In-memory system structures for file system management

When a file is opened, and while operations are subsequently performed on it, the OS keeps in memory a number of system structures, shown in Fig. 19.12.

Fig. 19.12. OS structures in memory for file system management.

When a file is opened, and when operations are executed that specify the access path to a file in the directory structure, the system finds a reference to the file control block. When performing exchange operations, the OS reads into memory the file data blocks on which the operations are performed. In addition, the OS maintains a system-wide table of open files. For each process, a table of the files opened by that process is also stored.
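The relationship between the system-wide table of open files and the per-process tables can be modelled roughly as follows. The class `OpenFileTables`, the field names, and the use of a path as the key of the system-wide table are assumptions made for the sketch, not a description of any particular OS.

```python
class OpenFileTables:
    """Toy model of the in-memory structures kept for open files."""

    def __init__(self):
        self.system_wide = {}   # path -> {"fcb": {...}, "open_count": int}
        self.per_process = {}   # pid -> {descriptor: {"path": str, "offset": int}}

    def open(self, pid, path):
        # One system-wide entry (with its file control block) per open file.
        entry = self.system_wide.setdefault(
            path, {"fcb": {"path": path}, "open_count": 0})
        entry["open_count"] += 1
        # Each process gets its own descriptor and its own current offset.
        table = self.per_process.setdefault(pid, {})
        fd = 0
        while fd in table:
            fd += 1
        table[fd] = {"path": path, "offset": 0}
        return fd

    def close(self, pid, fd):
        path = self.per_process[pid].pop(fd)["path"]
        entry = self.system_wide[path]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            del self.system_wide[path]   # last user gone: drop the FCB

tables = OpenFileTables()
fd1 = tables.open(1, "/docs/a.txt")
fd2 = tables.open(2, "/docs/a.txt")
shared_count = tables.system_wide["/docs/a.txt"]["open_count"]
tables.close(1, fd1)
tables.close(2, fd2)
```

Two processes that open the same file share one system-wide entry but keep separate positions in it, which is why both kinds of table are needed.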

Key terms

Network File System (NFS) - a widely used system for shared access to files over a local network.

Absolute path - the full access path to a file, starting from the logical partition name or from the root directory of the system.

File attributes - general properties describing the contents of the file.

Block - a logical unit of a file's information (a part of the file), usually combining several records in order to optimize I/O operations.

File control block (FCB) - a structure in memory that contains information about a file and is used by the operating system.

Directory (folder) - a structure in external memory containing symbolic names of files and other directories, and links to them.

Object Code File Addendum (DOFK) - in the Elbrus system: a file containing, in a unified form, tables of the named entities defined in a program and its procedures (metadata).

File header - the head record of a file, which contains its attributes.

Record - an elementary unit (part) of a file, in terms of which exchange operations with the file are defined.

Protection - control information that specifies the permissions to read, modify, and execute the file.

Container (in the Elbrus system) - file storage on one or several disks.

Mounting - attaching a separate subtree of a not-yet-mounted file system to some node (the mount point) of the common tree of available file systems.

Data set - IBM's term for a file.

Sharing - the ability of different users to access files and directories, including over a local network.

Relative path - the access path to a file relative to some current directory.

File memory - the file's records, containing the information actually stored in it.

Path - the compound name of a file or directory, consisting of the name of the root directory (or logical drive) and a sequence of directory names of subsequent levels.

Partition - a contiguous area of disk memory that has its own logical name (usually one of the first letters of the Latin alphabet).

Backup - copying files and directories to external media, such as tape (streamer), flash memory, an external portable hard disk, or a compact disc (CD, DVD), in order to keep them safe.

External links directory (SVS) - in the Elbrus system: a directory that exists for each file and is used to store its external links to other files; SVS elements are addressed by number, not by name.

Mount point - a node in the file system tree to which a new file system is attached when it is mounted.

File - a contiguous region of logical address space, typically stored in external memory.

Object Code File (FOK) - in the Elbrus system: a file that stores the binary code of an executable program.

File system - a directory subtree on some machine, located in one partition.

Questions

1. What is a file?

2. What types of information can be stored in a file?

3. What structure can the file have?

4. What programs interpret the contents of the file?

5. What are the main file attributes?

6. What are the basic operations on a file?

7. How does the system determine the file type?

8. What name extensions are used in operating systems?

9. What methods of accessing files do you know?

10. What operations are defined on direct access files?

11. What operations are defined on sequential access files?

12. What is an index file and what is it used for?

13. What is a directory?

14. What are the features, advantages and disadvantages of the Elbrus file system?

15. What is a partition?

16. What are the basic operations on a directory?

17. What are the purposes of logical directory organization?

18. Which directory organization is most preferable and why?

19. What problems arise when organizing directories in the form of an arbitrary graph?

20. What is mounting file systems?

21. What is a mount point?

22. What is shared access to files and why is it necessary?

23. What is NFS?

24. What is file protection?

25. What security permissions and for which users are considered in UNIX?

26. What is a file control block?

27. What levels of abstraction can be distinguished in the implementation of file systems?

28. What structures in memory does the OS create when opening a file and to manage exchange operations?

Exercises

1. Implement a set of basic file operations using low-level I/O primitives.

2. Implement sequential access operations on files using direct access operations.

3. Implement index files and operations for accelerated search of information in main files using index files.

4. Implement the directory structure and basic operations on it using file operations. Store all links in symbolic form.

5. Develop and implement an algorithm for finding cyclic links in the directory structure.

The file system is usually located on disks or other external storage devices that have a block structure. In addition to blocks storing directories and files, several more service areas are supported in external memory.

In the UNIX world there are several different types of file systems, each with its own external memory structure. The best known are the traditional UNIX System V file system (s5) and the file system of the UNIX BSD family (ufs). The s5 file system consists of four sections (Figure 2.2, a). In the ufs file system, a logical drive (a partition of a real disk) holds a sequence of file system sections (Figure 2.2, b).

Fig. 2.2. Structure of external memory of s5 and ufs file systems

Let us briefly describe the essence and purpose of each disk area.

The boot block contains a bootstrap program that is used for the initial launch of the UNIX OS. In s5 file systems, only the boot block of the root file system is actually used. In additional file systems, this area is present but not used.
The superblock is the most critical area of the file system, containing information that is necessary to work with the file system as a whole. The superblock contains a list of free blocks and free i-nodes (information nodes). In ufs file systems, multiple copies of the superblock are kept to increase robustness (as can be seen from Figure 2.2, b, one copy per cylinder group). Each copy of the superblock is 8192 bytes in size, and only one copy is used when mounting the file system (see below). However, if during mounting the primary copy of the superblock is found to be damaged or does not meet the information integrity criteria, a backup copy is used.
A cylinder group block contains the number of i-nodes specified in the i-node list for a given cylinder group, and the number of data blocks associated with these i-nodes. The size of a cylinder group block depends on the size of the file system. To improve efficiency, the ufs file system tries to place i-nodes and their data blocks in the same cylinder group.
The i-node list (ilist) contains the i-nodes corresponding to the files in a given file system. The maximum number of files that can be created in the file system is determined by the number of available i-nodes. An i-node stores information describing the file: file access modes, time of creation and last modification, user ID and group ID of the file's creator, a description of the file's block structure, etc.
Data blocks - this part of the file system stores the actual file data. In the ufs file system, an attempt is made to place all data blocks of one file in the same cylinder group. The data block size is determined when the file system is formatted with the mkfs command and can be set to 512, 1024, 2048, 4096 or 8192 bytes.
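Much of the information kept in an i-node can be inspected from a program. The sketch below uses `os.stat` from Python's standard library on a temporary file; the helper name `describe` and the choice of fields are illustrative, and on non-UNIX systems some fields (such as the owner IDs) may be reported as zero.

```python
import os
import stat
import tempfile

def describe(path):
    """Collect a few of the per-file attributes that an i-node stores."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                   # i-node number
        "mode": stat.filemode(st.st_mode),    # file type and access modes
        "owner_uid": st.st_uid,               # user ID of the owner
        "group_gid": st.st_gid,               # group ID of the owner
        "size": st.st_size,                   # size in bytes
        "modified": st.st_mtime,              # time of last modification
    }

# Create a small temporary file so the example does not depend on existing files.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name

info = describe(path)
os.remove(path)
```

Note that the file name is not among these attributes: the name lives in a directory entry, which refers to the i-node.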

The operating system, which underlies the operation of any computer, organizes work with electronic data according to a definite scheme in which the file system plays an essential part. This article explains what a file system is and which types are in use today.

General characteristics of the file system

As stated above, the FS is the part of the operating system directly responsible for placing, deleting, and moving electronic information on a particular medium, and for keeping it safe for later use. It is also used when information lost because of a software failure has to be restored. In other words, it is the main tool for working with files.

Types of file system

Each storage device uses a particular type of FS. The following types are especially common:

- for hard drives;
- for magnetic tapes;
- for optical media;
- virtual;
- network.

The main logical unit of work with electronic data is the file: a named document containing organized information of a certain kind. The name makes it easier for the user to work with a large number of electronic documents.
Everything the operating system uses is stored as files, whether text, images, sound, video, or photos. Drivers and software libraries are stored as files too.

Each file has a name, an extension, a size, attributes, and a type. The FS is the totality of these files together with the rules for working with them.

How effectively the system works with such data depends on its specific features, and these features are the basis for classifying file systems into types.

A look at the file system from a programming perspective

When studying the file system, keep in mind that it is a multi-level component. At the first level is a translation layer, which provides the interaction between the file system and a particular application. It converts a request for data into a format recognized by the drivers, which makes access to the files possible.

Modern client-server applications place very high demands on the FS. Modern file systems must provide efficient access to all available types of files, support large-capacity media, protect data from unwanted access by other users, and ensure the integrity of the stored information.

Below we look at several common file systems and their advantages and disadvantages.

FS - FAT

This is the oldest of the file systems considered here; it was developed back in 1977. It was used with the 86-DOS operating system and originally could not work with hard disks: it was designed for floppy disks holding up to one megabyte of information. That size limit is no longer relevant today, but the other characteristics remain in use.

Microsoft, a leading software developer, used this file system in operating systems such as MS-DOS 1.0.
Files in this system have a number of characteristic properties:

- the name must begin with a letter or digit, and the rest of the name may include various keyboard symbols;
- the name must not exceed eight characters; it is followed by a period and an extension of up to three characters;
- letters of either case may be used in a file name.

From the very beginning, the FAT file system was aimed at the DOS operating system; it does not store any data about the user or owner of a file.

Thanks to its various modifications, this FS has become one of the most widely used, and modern operating systems still support it.

FAT keeps a backup copy of the file allocation table, which can help preserve files if the computer is switched off improperly, for example when the battery runs out or the power fails; it does not, however, keep a transaction log.

Many operating systems that work with FAT include utilities that check and repair the file system's directory tree and the files themselves.

FS - NTFS

NTFS is a modern file system designed for the Windows NT operating system. Windows NT includes the convert utility, which converts volumes from the HPFS or FAT format to the NTFS format.

It is more advanced than FAT, described above. It offers extended control over access to individual files and directories, a large number of attributes, dynamic file compression, and fault tolerance. One of its advantages is support for the requirements of the POSIX standard.

This file system allows file names up to 255 characters long.

If the operating system fails, the files on an NTFS volume generally remain intact, since this file system is able to recover itself.

A feature of the NTFS file system is its structure, which is organized as a table, the master file table (MFT). The first sixteen records of the table describe the file system itself: they include information about the table, a mirror copy of the MFT, and the log file used when information has to be restored. The subsequent records hold information about the files themselves and about their data stored on the hard drive.

All operations that modify files are logged, which helps the file system recover on its own after a failure of the operating system.

FS - EFS

Another widely used facility is EFS (Encrypting File System). It works with the Windows operating system, as a feature of NTFS, and stores files on the hard drive in encrypted form, which protects them from being read by other users.
Encryption is enabled in the file's properties by ticking the checkbox for encrypting its contents. Using this function, you can specify who is allowed to view and work with the files.

FS – RAW

Files are the most vulnerable objects a user works with. They are the information stored on the computer's disks, and they can be damaged, deleted, or hidden. Most of the user's work consists of creating, saving, and moving them.
The operating system does not always work perfectly and can fail, for many reasons that are outside the scope of this article.

Many users encounter a message saying that their file system is RAW. Is RAW really a file system? Not quite. RAW is the state that the Windows operating system reports for a volume when it cannot recognize the file system on it; it indicates a logical error. If the system reports RAW, the structure of the file system is damaged or is not working correctly.

When this happens, the files on the affected volume cannot be accessed, and other operations on it fail as well.

FS – UDF

This is a file system for optical discs, with the following characteristics:

- file names must not exceed 255 characters;
- names may use both lowercase and uppercase letters.

It is supported by the Windows XP operating system.

FS - EXFAT

Another modern file system is exFAT. It is supported by both Windows and Linux, which makes it convenient for transferring files between these systems, whose native file systems differ. It is used on portable storage devices, such as flash drives.

From the above we can conclude that each of the file systems described has its own characteristics and its own on-disk format. This is why files are sometimes inaccessible: they were written to a file system that your operating system cannot recognize.
We hope this article helps you avoid problems when working with files. You can now determine which file system your computer's OS works with.