Lab 1: Triage Unknown Binary Files
Overview
This lab will guide you through the process of analyzing unknown binary files using various command-line tools and techniques. By the end, you’ll understand how to determine file types, inspect binary contents, and extract useful information.
Goals:
- Understand magic bytes and their role in identifying file types.
- Use xxd to get an overview of patterns in binary files.
- Employ the file command for initial file type analysis.
- Utilize the strings utility to extract readable text from binaries.
Estimated Time: 45 Minutes
Instructions
Understanding Magic Bytes
Magic bytes are specific byte sequences at the beginning of a file that help identify its format. Use online resources or documentation to explore how magic bytes work and what their limitations are.
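As a minimal sketch of what inspecting magic bytes looks like in practice, the helper below prints the first few bytes of a file as hex using od. The function name first_bytes and the default of 8 bytes are assumptions made for illustration; they are not part of the lab materials.

```shell
# Hypothetical helper (not part of the lab materials): print the first
# N bytes of a file as space-separated hex pairs. N defaults to 8.
first_bytes() {
    file_path=$1
    count=${2:-8}
    # -An drops the offset column, -tx1 prints one hex pair per byte,
    # -N limits how many bytes are read. The tr/sed steps only tidy
    # the whitespace so the result fits on one line.
    od -An -tx1 -N "$count" "$file_path" | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}
```

For example, running first_bytes on a file that begins with the ASCII text GIF89a and asking for 6 bytes prints 47 49 46 38 39 61. Compare what you see for the lab files against the patterns you research below.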
Question
Research and write down the magic byte patterns for the following:
- PNG Image Files
- Zip Compressed Files
- Linux ELF Executables
- Windows PE Executables
Analyzing Patterns with Hex Viewers
Download the following files and use the xxd command to view a hex/ASCII representation of each. Explore both the hex and binary overview displays to identify any recognizable patterns or structures.
Tip
Pipe the output to the less command to scroll and search in the terminal.
xxd <file> | less
Use / to enter search mode while using less. Type something and press Enter. Then use n and Shift+N to find the next match or the previous one, respectively.
Or use your favorite editor:
xxd <file> > output.txt
emacs output.txt
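For large files it can help to dump only a small window rather than paging through everything. The sketch below wraps two standard xxd options, -s (start offset) and -l (length), and adds -b for a bit-level view. The helper names dump_window and dump_bits are hypothetical and exist only for this example.

```shell
# Hypothetical helper: hex-dump LENGTH bytes starting at byte OFFSET,
# instead of dumping the whole file.
dump_window() {
    file_path=$1
    offset=$2
    length=$3
    # -s skips to the offset, -l limits the number of bytes dumped.
    xxd -s "$offset" -l "$length" "$file_path"
}

# Hypothetical helper: the same window shown as bits (-b) rather than hex.
dump_bits() {
    file_path=$1
    offset=$2
    length=$3
    xxd -b -s "$offset" -l "$length" "$file_path"
}
```

For instance, dump_window <file> 0 64 shows only the first 64 bytes, which is usually enough to see a header.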
Question
Describe any structure that you see in the hex representation of the bytes. What about the ASCII representation?
Extracting Plaintext Strings
Use the strings command to find readable text within each binary. Experiment with the -n flag to look for strings of different lengths.
strings unknown_file.bin
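One possible way to cut down the noise is to raise the minimum length and print each string's offset, so hits can be looked up again in the hex dump. The sketch below assumes a strings implementation that supports -n and -t x (both are in the POSIX description of strings). The helper name long_strings and the default of 8 characters are illustrative choices, not part of the lab.

```shell
# Hypothetical helper: list printable strings of at least MIN_LEN
# characters (default 8), each prefixed with its hex offset in the file.
long_strings() {
    file_path=$1
    min_len=${2:-8}
    # -n sets the minimum string length; -t x prints the offset of
    # each string in hexadecimal.
    strings -n "$min_len" -t x "$file_path"
}
```

The output can be piped to less in the same way as the xxd output above, for example long_strings unknown_file.bin 12 | less.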
Question
Is it difficult to find interesting strings in the default strings output?
How does this change if you use the -n flag?
Do the strings give you any ideas about what this file type is used for?
Tip
The strings command also supports different string encodings. Try using some of the other strings command-line flags to find UTF-16 LE (little endian) strings in TODO: filename.
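To see why the encoding matters, it may help to look at how ASCII text is laid out once it is stored as UTF-16 LE: every character is followed by a zero byte, so a scan for runs of single-byte printable characters sees only one-character runs. The sketch below assumes iconv is installed and accepts the encoding name UTF-16LE. The helper name utf16le_bytes is hypothetical.

```shell
# Hypothetical helper: print the bytes of an ASCII string after
# converting it to UTF-16 LE, to show the interleaved zero bytes.
utf16le_bytes() {
    printf '%s' "$1" |
        iconv -f ASCII -t UTF-16LE |
        od -An -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}
```

For example, utf16le_bytes Hi prints 48 00 69 00.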
Using the file Command
Use the file command to get a description of each file’s type.
file [file-1 file-2 ...]
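When comparing many files, a shorter, uniform output can be easier to scan than the full description. The sketch below uses two standard options of file, -b (omit the filename) and --mime-type (print only the MIME type). The helper name mime_types is hypothetical and only serves this example.

```shell
# Hypothetical helper: print "<mime-type>  <path>" for each file given.
mime_types() {
    for path in "$@"; do
        # -b omits the filename from file's own output; --mime-type
        # prints only the MIME type (for example text/plain).
        printf '%s  %s\n' "$(file -b --mime-type "$path")" "$path"
    done
}
```

Calling mime_types on all downloaded files at once gives one line per file, which can be compared directly against your magic bytes notes.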
Alternatively, you can upload a file to a browser-based version of the file command to get similar results. For small files, you can use a version without leaving this site!
Question
What types of files are the above?
Does this match the results from your initial magic bytes analysis?
Recursive Analysis
At least one of the files you downloaded is a compressed archive containing other files. Use the unzip program to list its contents without extracting them. Then decompress the files to a specific directory of your choice.
# List the contents of a zip file
unzip -l file.zip
# Extract to a directory
unzip file.zip -d <DIR>
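The two steps, extracting and then identifying what came out, can be combined. The following is a minimal sketch, assuming unzip, find, and file are available. The helper name triage_archive and its arguments are hypothetical.

```shell
# Hypothetical helper: extract ARCHIVE into DEST_DIR, then run file on
# every regular file that was extracted.
triage_archive() {
    archive=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    # -q keeps unzip quiet; -d selects the extraction directory.
    unzip -q "$archive" -d "$dest_dir" || return 1
    # Walk the extraction directory and identify each regular file.
    find "$dest_dir" -type f -exec file {} +
}
```

If any extracted file turns out to be an archive itself, the same helper can be run again on that file with a new destination directory.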
Question
Examine the contents found within the compressed files you downloaded earlier and describe what kinds of files are contained within them.
General Lab Tips
Tip
- Use the man command to learn how to use various Linux command-line utilities. This information can also be browsed at many different sites.
- When in doubt, google “man command” and read the result.
- When analyzing magic bytes, remember they are not foolproof. Files can be mislabeled.
- The file command is a quick way to get a general idea but might not always be accurate.
Submission
Submit your answers to the questions in this lab as a .pdf file.