unit-1

The document provides an overview of file systems, focusing on the File Allocation Table (FAT) and New Technology File System (NTFS), detailing their structures, functionalities, and differences. It explains how FAT manages files through cluster allocation and tracking, while NTFS offers advanced features like file permissions, compression, and reliability. Additionally, the document discusses malware detection techniques, including signature-based and machine learning approaches, as well as malware analysis methods such as static and dynamic analysis.


Introduction to File Systems

• File systems are the organized structures that manage how data is stored and retrieved on storage devices. Understanding the fundamentals of file systems is crucial for effectively navigating and managing digital information.
FAT
• It is an essential part of the file system that helps to keep track of where files are stored on a disk and how much space is available for new files.
• It keeps track of the location of each file on the device by using a table that maps file names to their physical location on the disk.
• It consists of a sequence of entries, with each entry representing a cluster on the disk. A cluster is a group of contiguous sectors and is the smallest unit of disk space that can be allocated to a file. Each entry in the FAT contains information about the status of the corresponding cluster, such as whether it is free or allocated to a file.
• The entries also contain pointers to the next cluster in a file, allowing the FAT to keep track of the sequence of clusters that make up a file. The first entry in the FAT is reserved for the root directory of the disk, while the remaining entries are used for file and directory clusters.
FAT12 and FAT16 have smaller maximum disk sizes and use shorter entry
sizes, while newer versions such as FAT32 can support larger disks and use
longer entry sizes to accommodate more clusters.
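The point about entry sizes can be made concrete with a quick calculation: the width of a FAT entry bounds how many clusters the table can address (FAT32 entries are 32 bits wide, but only 28 of those bits are usable for cluster numbers).

```python
# Address space of each FAT variant, determined by its entry width.
ENTRY_BITS = {"FAT12": 12, "FAT16": 16, "FAT32": 28}  # FAT32: 28 usable bits

for name, bits in ENTRY_BITS.items():
    print(f"{name}: roughly 2**{bits} = {2**bits:,} addressable clusters")
```

More addressable clusters, multiplied by the cluster size, is what lets the newer variants span much larger volumes.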

• FAT12 was the original version of the FAT file system, first introduced in 1980 with MS-DOS. It was designed for small disks, with a maximum size of 16MB and a cluster size of 512 bytes. FAT12 is no longer commonly used, but it can still be found on some older devices such as digital cameras and music players.
• FAT16 was the next version of the FAT file system, which was introduced in 1984
with the release of MS-DOS 3.0. It supports larger disks than FAT12, with a maximum
size of 2GB and a cluster size of up to 64KB. FAT16 is still used on some devices, but
it is not as common as it used to be.
• FAT32 is the most recent version of the FAT file system, which was introduced in
1996 with the release of Windows 95 OSR2. It was designed to support larger disks
than FAT16, with a maximum size of 2TB and a cluster size of up to 32KB. FAT32 is
still widely used today, particularly on removable storage devices such as USB
drives and SD cards.
Explanation of how FAT manages files

• When a file is created or saved, the operating system allocates one or more
clusters to the file and updates the corresponding entries in the FAT to indicate
that these clusters are now in use. The first entry in the FAT is reserved for the root
directory of the disk, which contains a list of all the files and directories on the disk.
• To access a file, the operating system uses the FAT to find the first cluster of the file
and then follows the chain of clusters that make up the file, using the pointers in
the FAT entries to locate each subsequent cluster. When a file is deleted or
moved, the operating system marks the corresponding clusters in the FAT as free,
making them available for use by new files.
• The FAT also helps to manage available space on the disk by keeping track of
free clusters and allocating them to new files as needed. When a file is saved or
modified, the operating system checks the FAT to find a sequence of free clusters
that are large enough to hold the file and allocates them to the file.
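The chain-following and allocation logic described above can be sketched in a few lines. This is a simplified, hypothetical model: the FAT is a plain list indexed by cluster number, `FREE` marks an unused cluster and `EOF` ends a chain, whereas real FAT variants use reserved numeric values (for example, end-of-chain markers like 0x0FFFFFFF in FAT32).

```python
# Toy model of a FAT: fat[n] holds the number of the next cluster in
# the file, FREE for unused clusters, EOF for the last cluster of a file.
FREE, EOF = None, -1

fat = [EOF, 5, FREE, FREE, FREE, 7, FREE, EOF, FREE]
# Cluster 0 is reserved (root directory in this toy model).
# A file starting at cluster 1 occupies the chain 1 -> 5 -> 7.

def cluster_chain(fat, first_cluster):
    """Follow FAT pointers from a file's first cluster to end-of-file."""
    chain = []
    cluster = first_cluster
    while cluster != EOF:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

def allocate(fat, n_clusters):
    """Find n free clusters and link them into a new chain."""
    free = [i for i, entry in enumerate(fat) if entry is FREE][:n_clusters]
    if len(free) < n_clusters:
        raise OSError("disk full")
    for cur, nxt in zip(free, free[1:]):
        fat[cur] = nxt
    fat[free[-1]] = EOF
    return free[0]  # first cluster, to be recorded in the directory entry

print(cluster_chain(fat, 1))  # [1, 5, 7]
```

Deleting a file in this model is just the reverse: walk the chain and set each entry back to `FREE`, which is why deleted data often remains on disk until the clusters are reused.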
NTFS
• NTFS, which stands for NT File System or New Technology File System, is the file system that the Windows NT operating system (OS) uses for storing and retrieving files on hard disk drives (HDDs) and solid-state drives (SSDs). NTFS is the Windows NT equivalent of the Windows 95 file allocation table (FAT) and the OS/2 High Performance File System (HPFS). However, NTFS offers several improvements over FAT and HPFS in terms of performance, extendibility and security.
NTFS features
One distinguishing characteristic of NTFS, compared with FAT, is that it
allows for file permissions and encryption. Notable features of NTFS
include the following:
• Organizational efficiency. NTFS uses a B-tree directory scheme to keep track of file clusters. This is significant because it allows for efficient sorting and organization of files.
• Accessible data. It stores data about a file's clusters and other data in the master file table (MFT), not just in an overall governing table as with FAT.
• File size. NTFS supports very large files.
• User permissions. It has an access control list that lets a server administrator control
who can access specific files.
NTFS features
• Compression. Integrated file compression shrinks file sizes and provides more
storage space.
• Unicode file naming. Because it supports file names based on Unicode, NTFS has a more natural file-naming convention and allows for longer file names with a wider array of characters. Non-Unicode naming conventions sometimes require translation.
• Secure. NTFS provides security for data on removable and nonremovable disks.
• Requires less storage. It has support for sparse files, which replace empty information -- long strings of zeros -- with metadata that takes up a smaller volume of storage space.
• Easy volume access. NTFS uses mounted volumes, meaning disk volumes can be
accessed as normal folders in the file system.
Advantages of NTFS
• Control. One of the primary features of NTFS is the use of disk quotas, which gives
organizations more control over storage space. Administrators can use disk
quotas to limit the amount of storage space a given user can access.
• Performance. NTFS uses file compression, which shrinks file sizes, increasing file
transfer speeds and giving businesses more storage space to work with. It also
supports very large files.
• Security. The access control features of NTFS let administrators place permissions
on sensitive data, restricting access to certain users. It also supports encryption.
• Easy logging. The MFT logs and audits files on the drive, so administrators can
track files that have been deleted, added or changed in any way. NTFS is a
journaling file system, meaning it logs transactions in a file system journal.
Advantages of NTFS
• Reliability. Data and files can be quickly restored in the event of a
system failure or error, because NTFS maintains the consistency of
the file system. It is a fault tolerant system and has an MFT mirror file
that the system can reference if the first MFT gets corrupted.
Disadvantages of NTFS
• Limited OS compatibility. The main disadvantage of NTFS is limited OS
compatibility; it is read-only with non-Windows OSes.
• Limited device support. Many removable devices don't support NTFS, including
Android smartphones, DVD players and digital cameras. Some other devices
don't support it either, such as media players, smart TVs and printers.
• Mac OS X support. OS X devices have limited compatibility with NTFS drives; they
can read them but not write to them.
Parsing FAT/NTFS file systems
File Allocation Table

• Microsoft first created the File Allocation Table (FAT) file system in 1977, and it is still used in the present day. FAT was predominantly used for devices such as floppy disk media, but it is now used for more advanced storage media such as USB drives and SD cards. FAT was primarily the main file system used by Microsoft, but nowadays the main file system used is NTFS. FAT has been updated thoroughly over time in order to cope with the demands of larger hard disks and file sizes (Fisher, 2017).
File Allocation Table

There are many different versions of the FAT file systems:


• FAT 1 & 2 – These were the earliest versions of the FAT file system.
• FAT12 – This file system uses 12-bit cluster addressing.
• FAT16 – This file system uses 16-bit cluster addressing.
• FAT32 – This file system uses 28-bit cluster addressing (stored in 32-bit entries); used by Windows 95 OSR2 and later (King, 2016).
LOOKING AT A FILE AND ITS FILE SLACK

File slack consists of the difference between the size of a physical file and the size of a logical
file. The physical size of a file is the size of the file that is stored on the hard drive and the
logical size of a file is the actual size of a file (Anon, 2008).
1. To find the length of a file, select the hex view; then simply scroll down to the bottom of
the hex pane related to the file and click on the last byte. Alternatively, you can click the
‘Properties’ pane on the left hand side and the length of the file will be listed.
2. To find the file slack of a file, simply find the file slack file that is linked to the file you have selected; the size will again be listed in the ‘Properties’ pane.
3. To find how many clusters are present in a file (depending on how many bytes are in each cluster), add the length of the file and the file slack together, then divide by the size of each individual cluster in bytes.
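The arithmetic in step 3 can be checked with a short calculation. The numbers here are invented for illustration; the key point is that logical size plus slack equals the physical size, which is always a whole number of clusters.

```python
import math

def clusters_used(logical_size, slack, cluster_size):
    """Physical size = logical size + file slack; divide by cluster size."""
    return (logical_size + slack) // cluster_size

# Example (made-up figures): a 10,000-byte file with 2,288 bytes of
# slack on a volume with 4,096-byte clusters.
logical, slack, cluster = 10_000, 2_288, 4_096
print(clusters_used(logical, slack, cluster))  # 3 clusters (12,288 bytes)
```

The same answer falls out of rounding the logical size up to whole clusters, since the slack is exactly the unused remainder of the last cluster.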
LOOKING AT A FILE AND ITS FILE SLACK

Figure 1: Properties tab in FTK.
Figure 2: The length of the file shown at the bottom of the pane.
VOLUME BOOT RECORD (VBR) AND MASTER BOOT RECORD (MBR)
• The Volume Boot Record is the first sector of the Windows Boot Sector, with eight more sectors that follow it. Seven of these are 512-byte sectors, with an extra 40 bytes to start the eighth sector (Sedory, 2015).
• The Master Boot Record (sometimes known as the ‘Master Partition Table’) is the first sector of the whole disk on any system and is used to identify how and where an operating system is located, to ensure it can be loaded into RAM (Random Access Memory), the computer’s main memory (Rouse, 2005). The MBR is the first part of the disk to be read. In FTK, in order to view just the contents of the MBR, simply click ‘Unpartitioned Space’ in the File List and the MBR will appear as shown in Figure 3.
Figure 3: The view of MBR.
NTFS File system
• New Technology File System (NTFS) was first created and released by Microsoft in 1993 and is now the main file system used within the Windows operating systems from Windows 2000 onwards. From Windows 7 onwards, new computers set their file system to NTFS by default (Domingo, 2013). The NTFS file system is able to support hard drives of just under 16 exabytes, with each file individually capped at 256 terabytes in Windows 8 and later versions.
• Within NTFS is the Master File Table (MFT), a file that consists of a 1024-byte record for every file or directory within the NTFS volume. The operating system collects information about files from the MFT, such as the creation date and time, and the name and size of the file (Beal, n.d.). As shown in Figure 4, the $MFT file will look something similar to this when inspecting the element. Furthermore, the ‘Properties’ pane will contain the ‘MFT Record Number’ (Butterfield, n.d.).
Figure 4: The layout of an $MFT file.
Importance of Malware detection

• Early Threat Identification: Effective malware detection helps in identifying threats at an early stage, preventing potential damage to systems and data.
• Security Vulnerabilities Mitigation: Detecting and removing malware assists in mitigating security vulnerabilities, reducing the risk of breaches and data theft.
• Compliance and Regulatory Requirements: Malware detection is essential for meeting compliance standards and regulatory requirements, ensuring data protection and integrity.
Techniques for Malware Detection
Signature-based detection
• Signature-based detection uses the unique digital footprint, known as a signature, of software programs running on a protected system. Antivirus programs scan software, identify its signature and compare it to signatures of known malware.
• Antivirus products use a large database of known malware signatures, typically
maintained by a security research team operated by the antivirus vendor. This
database is frequently updated and the latest version is synchronized with
protected devices.
• When an antivirus program identifies software that meets a known signature, it
stops the process and either quarantines or deletes it. This is a simple and effective
approach to malware detection and is important as the first line of defense.
However, as attackers become more sophisticated, the signature-based
approach cannot detect a wide variety of newer threats.
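In its simplest form, a signature match is a lookup of a file's cryptographic hash against a database of known-bad hashes. The sketch below uses SHA-256 over invented byte strings; real antivirus products also match byte patterns inside files and rely on large, frequently updated vendor databases.

```python
import hashlib

# Hypothetical database of known-malware SHA-256 digests (invented sample).
KNOWN_BAD = {
    hashlib.sha256(b"EVIL_PAYLOAD_EXAMPLE").hexdigest(),
}

def scan(contents: bytes) -> str:
    """Hash the file contents and look the digest up in the database."""
    digest = hashlib.sha256(contents).hexdigest()
    return "quarantine" if digest in KNOWN_BAD else "clean"

print(scan(b"EVIL_PAYLOAD_EXAMPLE"))  # quarantine
print(scan(b"harmless document"))     # clean
```

This also illustrates the weakness noted above: changing a single byte of the payload changes the hash, so the modified sample slips past an exact-match database.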
Machine Learning Behavioral Analysis
• The above techniques are known as “static” detection techniques because they rely on
binary rules that either match or do not match a process running in the environment. Static
malware detection cannot learn, it can only add more rules or fine-tune its rules over time
to increase coverage.
• By contrast, new dynamic techniques, based on artificial intelligence and machine
learning (AI/ML), can help security tools learn to differentiate between legitimate and
malicious files and processes, even if they do not match any known pattern or signature.
They do this by observing file behavior, network traffic, frequency of processes,
deployment patterns, and more. Over time, these algorithms can learn what “bad” files
look like, making it possible to detect new and unknown malware.
• AI/ML malware detection is known as “behavioral” detection because it is based on an
analysis of the behavior of suspect processes. These algorithms have a threshold for
malicious behavior, and if a file or process exhibits unusual behavior that crosses the
threshold, they determine it to be malicious.
• Behavioral analysis is powerful, but it can sometimes miss malicious processes or incorrectly classify legitimate processes as malicious. In addition, attackers can manipulate AI/ML training processes. In several cases, attackers fed specially crafted artifacts to a behavioral analysis mechanism to train it to recognize malicious software as safe.
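The thresholding idea can be illustrated with a toy score: each observed behavior contributes a weight, and a process whose total crosses the threshold is flagged. The behaviors, weights and threshold here are hand-picked for illustration; a real AI/ML detector learns them from labeled training data rather than hard-coding them.

```python
# Hypothetical behavior weights (invented for this sketch).
WEIGHTS = {
    "writes_to_system_dir": 0.4,
    "disables_security_tool": 0.6,
    "beacons_to_unknown_host": 0.3,
    "reads_user_documents": 0.1,
}
THRESHOLD = 0.7

def classify(observed_behaviors):
    """Sum the weights of observed behaviors and compare to the threshold."""
    score = sum(WEIGHTS.get(b, 0.0) for b in observed_behaviors)
    return ("malicious" if score >= THRESHOLD else "benign"), score

print(classify({"writes_to_system_dir", "disables_security_tool"}))
print(classify({"reads_user_documents"}))
```

The poisoning attack mentioned above corresponds to an adversary nudging these learned weights downward for genuinely malicious behaviors during training.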
Malware Analysis technique
• Malware analysis is necessary to develop effective malware detection techniques. It is the process of analyzing the purpose and functionality of a piece of malware; the goal of malware analysis is to understand how a specific piece of malware works so that defenses can be built to protect the organization’s network. There are three types of malware analysis, which achieve the same goal of explaining how malware works and its effects on the system, but the tools, time and skills required to perform the analysis are very different.
Techniques for Malware Analysis are
• Static analysis
• Dynamic analysis
• Hybrid analysis
Static analysis
• It is also called code analysis.
• It is the process of analyzing the program by examining it, i.e. the software code of the malware is observed to gain knowledge of how the malware’s functions work.
• In this technique, reverse engineering is performed by using disassemblers, decompilers, debuggers and source code analyzer tools such as IDA Pro and OllyDbg in order to understand the structure of the malware [9].
• Before the program is executed, static information found in the executable, including header data and the sequence of bytes, is used to determine whether it is malicious.
• Disassembly is one of the techniques of static analysis. With static analysis, the executable file is disassembled using tools like xxd, hexdump or the Netwide Disassembler (ndisasm) to get the assembly language program file. From this file the opcodes are extracted as features to statically analyze the application behavior and detect the malware.
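The "header data and sequence of bytes" mentioned above can be examined without ever running the program. This sketch checks the well-known magic numbers that identify Windows PE and Linux ELF executables and dumps the leading bytes xxd-style; the sample bytes are fabricated stand-ins for real file contents.

```python
def identify(header: bytes) -> str:
    """Classify a file by its leading magic bytes (a purely static check)."""
    if header[:2] == b"MZ":       # DOS/PE executable stub
        return "PE executable"
    if header[:4] == b"\x7fELF":  # Linux ELF binary
        return "ELF executable"
    return "unknown"

def hexdump(data: bytes) -> str:
    """Minimal xxd-style hex dump of the bytes, for manual inspection."""
    return " ".join(f"{b:02x}" for b in data)

sample = b"MZ\x90\x00\x03\x00"    # fabricated PE-style header
print(identify(sample))           # PE executable
print(hexdump(sample))            # 4d 5a 90 00 03 00
```

Real static analyzers go much further (imports, strings, section layout, opcodes), but every such feature is read from the bytes on disk in the same spirit.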
Dynamic analysis
• It is also called behavioral analysis.
• Analysis of an infected file during its execution is known as dynamic analysis [2].
• Infected files are analyzed in a simulated environment such as a virtual machine, simulator, emulator or sandbox [6]. Malware researchers then use SysAnalyzer, Process Explorer, ProcMon, RegShot and other tools to identify the general behavior of the file [9].
• In dynamic analysis the file is detected after executing it in a real environment; during execution of the file, its system interaction, its behavior and its effect on the machine are monitored.
• The advantage of dynamic analysis is that it accurately analyzes known as well as unknown, new malware. It is easy to detect unknown malware, and it can also analyze obfuscated, polymorphic malware by observing its behavior. However, this analysis technique is more time consuming: it requires time to prepare the environment for malware analysis, such as a virtual machine environment or sandboxes.
Hybrid analysis

• This technique is proposed to overcome the limitations of the static and dynamic analysis techniques. It first analyses the signature specification of the malware code and then combines it with the other behavioral parameters for enhancement of complete malware analysis. Due to this approach, hybrid analysis overcomes the limitations of both static and dynamic analysis.
Server logs
A web server log is a text document that contains a record of all activity related to a specific web server over a defined period of time. The web server gathers data automatically and constantly to provide administrators with insight into how and when a server is used, as well as the users that correspond with that activity.
Server log content and values

Each line within the server log file contains significant information,
including:
• The device’s IP address
• Request method
• Date and time of the request
• Status of the request
• Referrer
• User-Agent
• Requested file information, including file name, size and network location
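The fields listed above can be pulled out of a log line with a regular expression. This sketch assumes the widely used Apache/NGINX "combined" log format; the log line itself is fabricated for illustration.

```python
import re

# Apache/NGINX "combined" log format (assumed for this example).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

line = ('203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 5120 '
        '"https://example.com/" "Mozilla/5.0"')

m = LOG_PATTERN.match(line)
print(m.group("ip"), m.group("method"), m.group("status"))
# 203.0.113.7 GET 200
```

Parsing lines into named fields like this is the first step for the debugging and security uses discussed next, e.g. counting 4xx/5xx statuses or spotting suspicious user agents.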
Why do we need server logs

• Optimize limited IT resources, including staff
• Establish dedicated logging levels and prioritize activity based on impact to the business or severity of the issue
• Address and debug HTTP errors
• Identify and fix broken links from external sources
• Streamline the user journey based on typical navigation patterns
• Adapt other business activity, such as sales, marketing or partner outreach
• Identify security risks and issues, including the presence of bots, malicious code or
spam
Volume Shadow copies
• The Volume Shadow Copy Service provides an infrastructure for creating point-in-
time snapshots (shadow copies) of volumes. Shadow Copy supports 64 shadow
copies per volume.
• A shadow copy contains previous versions of the files or folders contained on a
volume at a specific point in time. While the shadow copy mechanism is
managed at the server, previous versions of files and folders are only available
over the network from clients, and are seen on a per folder or file level, and not as
an entire volume.
• The shadow copy feature uses data blocks. As changes are made to the file
system, the Shadow Copy Service copies the original blocks to a special cache
file to maintain a consistent view of the file at a particular point in time. Because
the snapshot only contains a subset of the original blocks, the cache file is
typically smaller than the original volume. In the snapshot's original form, it takes
up no space because blocks are not moved until an update to the disk occurs.
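The copy-on-write behavior described above can be modeled in a few lines. A snapshot stores nothing at first; only when a block is about to be overwritten is the original copied into the snapshot's cache, which is why the cache stays smaller than the volume. This is a simplified, hypothetical model, not the actual Volume Shadow Copy Service mechanism.

```python
class Volume:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        cache = {}                  # block index -> original contents
        self.snapshots.append(cache)
        return cache                # empty: the snapshot takes no space yet

    def write(self, index, data):
        for cache in self.snapshots:
            if index not in cache:  # copy the original block on first write
                cache[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, cache, index):
        """Point-in-time view: cached original if changed, live block if not."""
        return cache.get(index, self.blocks[index])

vol = Volume(["aaa", "bbb", "ccc"])
snap = vol.snapshot()
vol.write(1, "BBB")
print(vol.blocks)                  # ['aaa', 'BBB', 'ccc']
print(vol.read_snapshot(snap, 1))  # bbb  (point-in-time view preserved)
print(len(snap))                   # 1 block cached, not the whole volume
```

The model also shows why shadow copies cannot survive media failure: unchanged blocks live only on the original volume, so losing it loses the snapshot's view too.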
Volume Shadow copies
By using shadow copies, a StoreEasy 1000 Storage system can maintain a set of previous versions of all files on the selected volumes. End users access the file or folder by using a separate client add-on program, which enables them to view the file in Windows Explorer.
Accessing previous versions of files, or shadow copies, enables users to:
• Recover files that were accidentally deleted. Previous versions can be opened and
copied to a safe location.
• Recover from accidentally overwriting a file. A previous version of that file can be
accessed.
• Compare several versions of a file while working. Use previous versions to compare
changes between two versions of a file.
• Shadow copies cannot replace the current backup, archive, or business recovery system, but they can help to simplify restore procedures. Because a snapshot only contains a portion of the original data blocks, shadow copies cannot protect against data loss due to media failures. However, the strength of snapshots is the ability to instantly recover data from shadow copies, reducing the number of times needed to restore data from tape.
