Abstract:
Today’s world is connected through the Internet: people can reach one another
and do business online, for example through online shopping and online banking.
Social networking sites are widely used, and people store sensitive data on
connected computers, laptops and mobile phones. Therefore, protecting information
properly is increasingly important. A Remote Access Trojan (RAT) is a kind of
malware that discloses confidential information to the wrong party. It communicates
over the network to send commands to the victim machine and control it remotely.
Researchers have proposed many approaches to detect this kind of malware. However,
threat actors use cunning ways to create new malware, so new Remote Access
Trojans and variants of existing RATs emerge every day. Advanced Persistent
Threats (APTs) and targeted attacks also use RAT-like command and control
communication to intrude on and control a victim remotely.
Remote Access Trojan detection faces three main challenges. First,
signature-based detection cannot keep up with RATs that camouflage themselves
using encryption and polymorphism. Second, although behavior-based detection is
useful for detecting unknown malware, machine learning methods lack effective
features for identifying the behavior of Remote Access Trojans. Third, in network
traffic classification, extracting features from a whole session, from the SYN
packet of the TCP three-way handshake to the end of the traffic, incurs considerable
overhead and takes a long time.
Both network-based and behavior-based approaches are applied in developing
malware detection software. Network behavioral analysis has been carried out for
many years with two objectives: to detect the command and control traffic of malware
such as Remote Access Trojans so that confidential data is not disclosed to the wrong
person, and to classify network traffic so that network administrators can manage
their networks easily.
In this thesis, network behavioral analysis is performed, and a new feature
extraction approach for detecting Remote Access Trojans at an early stage is proposed
and implemented. The malicious behavior of Remote Access Trojans is differentiated
from normal network traffic using the first twenty packets of each network trace. Four
machine learning algorithms are applied for classification, and their parameters are
tuned to obtain the best performance. This thesis makes three primary
contributions. First, the proposed detection solutions rely on the fundamental behavior
of Remote Access Trojans and are immune to malware obfuscation and traffic encryption.
Second, the solutions are general enough to identify different types of Remote Access
Trojans, and they can also be extended to counter next-generation Remote Access
Trojans. Third, the detection solutions meet the needs of network traffic classification
by differentiating normal traffic from malicious traffic.
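
As a rough illustration of early-stage feature extraction, the sketch below computes a few simple statistics from only the first twenty packets of a session. It is a minimal sketch under stated assumptions, not the feature set of this thesis: the `Packet` record, the `early_stage_features` function and the six features it returns are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

EARLY_STAGE_PACKETS = 20  # early-stage window named in the abstract


@dataclass
class Packet:
    """Hypothetical, simplified packet record (not a real capture format)."""
    timestamp: float   # seconds since the start of the capture
    outbound: bool     # True if sent by the monitored host (assumed convention)
    payload_len: int   # payload size in bytes


def early_stage_features(packets: List[Packet],
                         n: int = EARLY_STAGE_PACKETS) -> Dict[str, float]:
    """Summarize only the first n packets of a session.

    The features are illustrative placeholders, not the thesis's features.
    """
    # Order by time and keep only the early-stage window.
    head = sorted(packets, key=lambda p: p.timestamp)[:n]
    if not head:
        raise ValueError("session has no packets")

    out_count = sum(1 for p in head if p.outbound)
    out_bytes = sum(p.payload_len for p in head if p.outbound)
    in_bytes = sum(p.payload_len for p in head if not p.outbound)
    # Time gaps between consecutive packets inside the window.
    gaps = [b.timestamp - a.timestamp for a, b in zip(head, head[1:])]

    return {
        "packet_count": float(len(head)),
        "outbound_packets": float(out_count),
        "inbound_packets": float(len(head) - out_count),
        "outbound_bytes": float(out_bytes),
        "inbound_bytes": float(in_bytes),
        "mean_inter_arrival": sum(gaps) / len(gaps) if gaps else 0.0,
    }


# Usage: a synthetic 30-packet session; only the first 20 packets are used.
session = [Packet(i * 0.5, i % 2 == 0, 100) for i in range(30)]
features = early_stage_features(session)
```

A feature vector of this kind could then be passed to any classifier. Because it depends only on the first twenty packets, it can be computed without waiting for the session to end.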
With this approach, Remote Access Trojans are detected at an early stage, and
it is not necessary to capture network traffic until the end of a session before
classifying it. The approach contributes to the objectives of information
security: Confidentiality, Integrity and Availability (CIA). Limitations and future
work are also discussed.