Automated File Recovery and Network Forensics
Adia Foster, Mary Mitchell, Vicki McLendon
Executive Summary
Project 2 was divided into two parts: a Python script and a HackTheBox challenge. The goal of the Python script was to automate file recovery based on file signatures. The script takes the disk image name as a command-line argument, opens the disk image, reads its contents, and converts them to hexadecimal. It then loops through the signature dictionary we created and, for each signature, finds and recovers the corresponding files on the image. Below is the name and screenshot of each file we found.
File1.mpg
File2.pdf
File3.pdf
File4.bmp

File5.gif
File6.gif
File7.jpg
File8.jpg
File9.jpg
File10.docx
File11.avi audio file
File12.avi audio file
File13.png
The HackTheBox challenge was run using the following command in the Linux Parrot terminal: “nmap -v [host’s IP address]”. We used the -v flag for verbose output, which gave us more insight into the open network ports and the protocols and services on those ports. We found 17 open network ports. The table below lists each open port and the protocol used on it.
Port Protocol
53 TCP
88 TCP
135 TCP
139 TCP
389 TCP
445 TCP
464 TCP
593 TCP
636 TCP
3268 TCP
3269 TCP
49152 TCP
49153 TCP
49154 TCP
49155 TCP
49157 TCP
49158 TCP
We go deeper into our process throughout the rest of the report.
Table of Contents
Executive Summary
Table of Contents
List of Figures
1 Introduction
2 Automated File Recovery
3 HackTheBox Network Forensics
4 Conclusions and recommendations
5 References

List of Figures
Figure 1: Program Output Files 1 – 5

Figure 2: Program Output Files 6 – 11

Figure 3: Program Output Files 12 – 13

Figure 4: Network Scan Part 1

Figure 5: Network Scan Part 2

1 Introduction
For the second project in our Digital Forensics course, our team was given two assignments. The first was to develop a Python script that takes in a disk image, locates file signatures, recovers the user-generated files, and generates a SHA-256 hash for each recovered file. The second was to conduct a network scan in our HackTheBox lab and then analyze and discuss our findings.
2 Automated File Recovery
For the first part of the project, we wrote a Python script to analyze a disk image and recover certain types of files from it. To do this, we first identified the file signatures and trailers for the file types that we wanted to recover using Gary Kessler’s file signatures table[2]. We then put these signatures and trailers into Python dictionaries to be used later in the program. Our next step was to open the disk image and get its contents in hexadecimal. We did this by opening the image with Python’s open() function in binary mode, reading it, and then using Python’s hex() to convert the binary contents into hexadecimal[3].
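The setup described above can be sketched roughly as follows. This is an illustration, not the script submitted with the report: the dictionary name SIGNATURES, the helper read_image_as_hex(), and the choice of file types are assumptions, and the byte values are the commonly documented magic numbers for these formats. The conversion uses the hex() method of the bytes object returned by read().

```python
# Hypothetical signature table: file type -> (header hex, trailer hex or None).
# The report's actual dictionaries are not shown; these entries are examples.
SIGNATURES = {
    "pdf": ("25504446", "2525454f46"),                  # "%PDF" ... "%%EOF"
    "png": ("89504e470d0a1a0a", "49454e44ae426082"),    # PNG magic ... IEND chunk
    "jpg": ("ffd8ff", "ffd9"),
    "gif": ("47494638", "003b"),                        # "GIF8" ... 00 3B
    "bmp": ("424d", None),                              # "BM"; size is in the header
}


def read_image_as_hex(path):
    """Read the whole disk image in binary mode and return it as lowercase hex."""
    with open(path, "rb") as image:
        return image.read().hex()
```

Because bytes.hex() produces lowercase digits, the signatures are stored in lowercase as well. Reading the whole image into memory doubles its size once converted to hex, which is acceptable for a small lab image but would not scale to large disks.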
Once the disk could be opened, we wrote the method that locates the file signatures and recovers the files. The main idea behind this method was to find all the files of a certain type and then move on to the next file type. We did this by looping through each file signature in our signature dictionary and, for each one, using Python’s find() method to locate instances of the signature in the disk contents. Once an instance of the signature was found, we checked that it was at the beginning of a sector. This ensured that the signature was actually a file header and not just part of the contents of another file. We used the modulus operator and the standard sector size of 512 bytes to determine whether the signature was at the beginning of a sector.
However, some of the file signatures were quite short and required further confirmation that the signature was actually the header of a file and not a false positive. This was a problem with BMP files, and to solve it we looked for the four reserved bytes containing all zeros that come after the two file signature bytes and the four file size bytes. If a false positive was identified, we moved the starting point of our search past the false positive and continued searching the rest of the disk. Once an actual file header was identified, we calculated the starting offset of the file by dividing the position found in the hexadecimal string by two. We had to do this because our disk contents were in hexadecimal and every byte is represented by two hexadecimal characters, so dividing by two gave us the starting offset of the file in bytes.
Next we identified the end of the file by searching for the corresponding trailer for each file type, once again using find(). When the footer was identified, we calculated the file size as the difference between the ending offset and the starting offset. However, some footers were difficult to identify, since file types like PDF can have multiple end-of-file markers and some trailer byte sequences may appear within the file’s contents rather than at its end. To deal with this, our program checked for trailing zeros after each footer to make sure that we were actually at the end of the file. This was only a problem for file types that do not specify the file size in the header. For file types that do, like BMP and AVI, the four file size bytes could be identified from the header layout and converted from hexadecimal to decimal. The file size could then be added to the starting offset to determine the ending offset. After all the location information was calculated, the only thing left to do was to recover the files and calculate the hashes.
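The header search, the sector-alignment check, and the BMP confirmation can be illustrated with a short sketch. The function names here are hypothetical and the code is not taken from the submitted script. It assumes the disk contents are already a hex string and relies on the standard BMP header layout: two signature bytes, a four-byte little-endian file size, then four reserved bytes.

```python
SECTOR_SIZE = 512  # bytes, the sector size stated in the report


def find_headers(disk_hex, signature_hex):
    """Yield byte offsets where signature_hex begins exactly on a sector boundary."""
    position = disk_hex.find(signature_hex)
    while position != -1:
        # Two hex characters per byte: an odd position would straddle two bytes.
        if position % 2 == 0 and (position // 2) % SECTOR_SIZE == 0:
            yield position // 2
        # Resume just past this hit, whether or not it was accepted.
        position = disk_hex.find(signature_hex, position + 1)


def looks_like_bmp(disk_hex, offset):
    """Check that the four reserved bytes at byte offsets 6-9 are all zero."""
    start = (offset + 6) * 2
    return disk_hex[start:start + 8] == "00000000"


def bmp_size(disk_hex, offset):
    """Decode the little-endian four-byte file size at byte offsets 2-5."""
    start = (offset + 2) * 2
    return int.from_bytes(bytes.fromhex(disk_hex[start:start + 8]), "little")
```

For a candidate at byte offset start, the ending offset is then start + bmp_size(disk_hex, start). Candidates that fail looks_like_bmp() are simply skipped, and the search continues from the next match.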
To recover the files, we built ‘dd’ commands, like the ones we used in the first project, from the location and size information we had calculated for each file. Once a command was assembled as a string, we executed it with the system() function from Python’s os module, which passes the command to the operating system[4]. We used the same approach to calculate the SHA-256 hashes, but instead of executing ‘dd’ commands, we executed ‘sha256sum’. The results for each file were then printed in the Linux terminal and the recovered file was placed in the current directory. This process was repeated until all files of a certain type were found, and then our method moved on to the next type. Once all file types had been searched for and recovered, we printed the total number of files found and terminated the program.
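As a rough sketch of this step, the commands could be assembled and run as shown below. The exact ‘dd’ options used in the submitted script are not shown in this report, so the use of bs=1 with byte-based skip and count values is an assumption, as are the function names. shlex.quote() is added here only to keep file names with spaces from breaking the shell command.

```python
import os
import shlex


def build_dd_command(image, out_name, start_offset, size):
    """Build a dd command that copies `size` bytes starting at `start_offset`."""
    # With bs=1, skip and count are expressed directly in bytes (simple but slow).
    return (
        f"dd if={shlex.quote(image)} of={shlex.quote(out_name)} "
        f"bs=1 skip={start_offset} count={size}"
    )


def build_hash_command(out_name):
    """Build the sha256sum command for a recovered file."""
    return f"sha256sum {shlex.quote(out_name)}"


def recover(image, out_name, start_offset, size):
    """Carve one file and print its SHA-256 hash by running both commands."""
    os.system(build_dd_command(image, out_name, start_offset, size))
    os.system(build_hash_command(out_name))
```

For example, build_dd_command("Project2.dd", "File1.mpg", 512, 1000) yields a command that skips the first 512 bytes of the image and copies the next 1000 bytes into File1.mpg. The offsets in this example are made up for illustration.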
The results of running our script on the provided Project2.dd disk image can be found in Figures 1, 2, and 3. The screenshots in the figures show the number of files recovered, the starting and ending offsets for each file, and the output of the recovery and SHA-256 commands. The recovered files can be found in a zip file along with this report. To see the script in action, run the following command in a Linux terminal: ‘python3 FileRecovery.py Project2.dd’. Please be aware that the program can take a couple of minutes to run completely, especially if there are larger files to recover.
3 HackTheBox Network Forensics
Our objective for the second portion of this project was to do a network scan and analysis of the HackTheBox Active lab. The first step we took was to connect to the dedicated lab using OpenVPN. Once connected to a machine, we performed the network scan in a Linux Parrot terminal using the following nmap command: “nmap -v 10.129.221.1” (Figures 4 and 5).
This command displayed the open network ports and the protocols and services on those ports. We chose the -v flag for verbose output to see a more detailed scan. We found that the protocol used on all of the open ports was Transmission Control Protocol (TCP). TCP is a connection-oriented protocol that provides reliable transport between sending and receiving processes. It also employs flow control so that the sender does not overwhelm the receiver, and congestion control to throttle the sender when the network gets overloaded. TCP is more reliable than UDP because it requires an acknowledgement that packets have been received. If the acknowledgement is not received before a timeout expires, the packet or packets are sent again, which makes the protocol robust to loss.
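What it means for a TCP port to be “open” can be illustrated with a small connection test. This is not how the scan in this report was performed, since nmap was used and has its own probing techniques. The function name and the timeout value are assumptions made for this sketch.

```python
import socket


def tcp_port_is_open(host, port, timeout=1.0):
    """Return True if a full TCP connection to host:port can be established."""
    try:
        # create_connection performs the complete three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A port counts as open in this sense when some service accepts the connection. A refused connection indicates a closed port, while a timeout usually suggests that a firewall is dropping the packets.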
Our next step for this part of the assignment was to look into the seventeen open network ports that the nmap scan revealed. One of the open ports was port 53, the port that DNS (Domain Name System) servers listen on in order to translate domain names, or common website names, into IP addresses. Port 88 was also open and was being used by Kerberos, an authentication service. On port 135 was Microsoft Remote Procedure Call, a service that handles interprocess communication between clients and servers, and on port 139 was NetBIOS, a service that supports connection-oriented file sharing. Lightweight Directory Access Protocol (LDAP), which is used to look up directory information such as contact details from servers, was listening on network port 389.
Additionally, on port 445 was Microsoft-DS, another service used for file sharing. However, port 445 can leave machines vulnerable to attacks if left open. Port 464 was running kpasswd5, a service used for changing a Kerberos password, which is also noted to contain vulnerabilities. A Remote Procedure Call over Hypertext Transfer Protocol service was listening on port 593, and a Lightweight Directory Access Protocol over Secure Sockets Layer (SSL) service was listening on port 636. The last two known services were Global Catalog LDAP and Global Catalog LDAP over SSL on ports 3268 and 3269. Finally, the remaining open ports were 49152 through 49155, 49157, and 49158, which fall in the dynamic port range generally used by applications.
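The findings above can be collected into a single lookup table for reference. The sketch below only restates the ports and services named in this section. The names OPEN_TCP_PORTS and format_port_table() are hypothetical, and None marks the high-numbered ports for which no specific service was identified.

```python
# Port -> service, as identified in this section of the report.
OPEN_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "Microsoft RPC",
    139: "NetBIOS",
    389: "LDAP",
    445: "Microsoft-DS",
    464: "kpasswd5",
    593: "RPC over HTTP",
    636: "LDAP over SSL",
    3268: "Global Catalog LDAP",
    3269: "Global Catalog LDAP over SSL",
    49152: None, 49153: None, 49154: None,
    49155: None, 49157: None, 49158: None,
}


def format_port_table(ports):
    """Return one 'port/tcp  service' line per open port, sorted by port number."""
    return [
        f"{port:>5}/tcp  {service or 'unnamed application port'}"
        for port, service in sorted(ports.items())
    ]


print("\n".join(format_port_table(OPEN_TCP_PORTS)))
```

The table holds seventeen entries, matching the number of open ports reported by the scan.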
An additional 982 TCP ports were scanned on the host IP of 10.129.221.1, and all of them were closed. The scan of the host completed in 1.48 seconds, during which nmap sent 1,004 raw packets and received 1,001 back, so only 3 packets were lost. The nmap scan also told us that there was a 0.080 second latency (Figure 5).
4 Conclusions and recommendations
Our group was able to recover 13 files using our automated recovery script and to review 17 open network ports. Our main recommendation for improving our process is to make our code more efficient, since in its current state it takes some time to run.
5 References
[3] How to read binary files as hex in python?: https://stackoverflow.com/questions/34687516/how-to-read-binary-files-as-hex-in-python.
