ocatak / malware_api_class

Licence: MIT license
Malware dataset for security researchers and data scientists: a public dataset generated by Cuckoo Sandbox from Windows OS API call analysis.

Programming Languages

python

Projects that are alternatives of or similar to malware api class

Pafish
Pafish is a testing tool that uses different techniques to detect virtual machines and malware analysis environments in the same way that malware families do
Stars: ✭ 2,026 (+1411.94%)
Mutual labels:  sandbox, malware, malware-families
memscrimper
Code for the DIMVA 2018 paper: "MemScrimper: Time- and Space-Efficient Storage of Malware Sandbox Memory Dumps"
Stars: ✭ 25 (-81.34%)
Mutual labels:  sandbox, malware
unprotect
Unprotect is a Python tool for parsing PE malware and extracting evasion techniques.
Stars: ✭ 75 (-44.03%)
Mutual labels:  sandbox, malware
fake-sandbox
👁‍🗨 This script will simulate fake processes of analysis sandbox/VM software that some malware will try to avoid.
Stars: ✭ 110 (-17.91%)
Mutual labels:  sandbox, malware
HomebrewOverlay
Browser extension adware (showHomebrewOverlayOuter)
Stars: ✭ 52 (-61.19%)
Mutual labels:  malware, adware
Nginx Ultimate Bad Bot Blocker
Nginx Block Bad Bots, Spam Referrer Blocker, Vulnerability Scanners, User-Agents, Malware, Adware, Ransomware, Malicious Sites, with anti-DDOS, Wordpress Theme Detector Blocking and Fail2Ban Jail for Repeat Offenders
Stars: ✭ 2,351 (+1654.48%)
Mutual labels:  malware, adware
Automated-Malware-Analysis-List
My personal Automated Malware Analysis Sandboxes and Services
Stars: ✭ 20 (-85.07%)
Mutual labels:  sandbox, malware
additional-hosts
🛡 List of categorized undesired hosts
Stars: ✭ 13 (-90.3%)
Mutual labels:  malware, adware
Drakvuf Sandbox
DRAKVUF Sandbox - automated hypervisor-level malware analysis system
Stars: ✭ 384 (+186.57%)
Mutual labels:  sandbox, malware
Norimaci
Norimaci is a simple and lightweight malware analysis sandbox for macOS
Stars: ✭ 37 (-72.39%)
Mutual labels:  sandbox, malware
Mba
Malware Behavior Analyzer
Stars: ✭ 125 (-6.72%)
Mutual labels:  sandbox, malware
Malware-Sample-Sources
Malware Sample Sources
Stars: ✭ 214 (+59.7%)
Mutual labels:  malware, malware-dataset
Bold-Falcon
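Bold-Falcon (Bifang Intelligent Cloud Sandbox) is an open-source automated malware analysis system; Fang Class comprehensive cybersecurity experiment, design category.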
Stars: ✭ 30 (-77.61%)
Mutual labels:  sandbox, malware
Docker Cuckoo
Cuckoo Sandbox Dockerfile
Stars: ✭ 289 (+115.67%)
Mutual labels:  sandbox, malware
lkm-sandbox
Collection of Linux Kernel Modules and PoC to discover, learn and practice Linux Kernel Development
Stars: ✭ 36 (-73.13%)
Mutual labels:  study, sandbox
rhino
Agile Sandbox for analyzing Windows, Linux and macOS malware and execution behaviors
Stars: ✭ 49 (-63.43%)
Mutual labels:  sandbox, malware
bpfbox
🐝 BPFBox 📦 Exploring process confinement in eBPF
Stars: ✭ 93 (-30.6%)
Mutual labels:  sandbox
DFIR Resources REvil Kaseya
Resources for DFIR Professionals Responding to the REvil Ransomware Kaseya Supply Chain Attack
Stars: ✭ 172 (+28.36%)
Mutual labels:  malware
Tools
Combination of different utilities, have fun!
Stars: ✭ 166 (+23.88%)
Mutual labels:  sandbox
Chase
Automatic trading bot (WIP)
Stars: ✭ 73 (-45.52%)
Mutual labels:  lstm-neural-networks


Windows Malware Dataset with PE API Calls

This public malware dataset was generated with Cuckoo Sandbox from an analysis of Windows OS API calls. It is intended for cyber security researchers working on malware analysis and is distributed in CSV format for machine learning applications.

Cite The DataSet
If you find these results useful, please cite them:

@article{10.7717/peerj-cs.346,
 title = {Data augmentation based malware detection using convolutional neural networks},
 author = {Catak, Ferhat Ozgur and Ahmed, Javed and Sahinbas, Kevser and Khand, Zahid Hussain},
 year = 2021,
 month = jan,
 keywords = {Convolutional neural networks, Cybersecurity, Image augmentation, Malware analysis},
 volume = 7,
 pages = {e346},
 journal = {PeerJ Computer Science},
 issn = {2376-5992},
 url = {https://doi.org/10.7717/peerj-cs.346},
 doi = {10.7717/peerj-cs.346}
}

Publications

The details of the Mal-API-2019 dataset are published in the following papers:

  • [Link] AF. Yazı, FÖ Çatak, E. Gül, Classification of Metamorphic Malware with Deep Learning (LSTM), IEEE Signal Processing and Applications Conference, 2019.
  • [Link] Catak, FÖ., Yazi, AF., A Benchmark API Call Dataset for Windows PE Malware Classification, arXiv:1905.01999, 2019.

Introduction

This study aims to provide data that helps close gaps in machine-learning-based malware research. Its specific objective is to build a benchmark dataset of Windows operating system API calls made by various malware. This is the first study to use metamorphic malware to build sequences of API calls. We hope this research contributes to a deeper understanding of how metamorphic malware changes its behavior (i.e., its API calls) by inserting meaningless opcodes using its own disassembler/assembler components.

Malware Types and System Overview

In our research, we mapped the family labels produced by each antivirus product to 8 main malware families: Trojan, Backdoor, Downloader, Worms, Spyware, Adware, Dropper, and Virus. Table 1 shows the number of samples per malware family in our dataset. As the table shows, the sample counts of all families except Adware are fairly close to one another. Adware is underrepresented because we found comparatively few samples from that family.

| Malware Family | Samples | Description |
|----------------|---------|-------------|
| Spyware | 832 | enables a user to obtain covert information about another's computer activities by transmitting data covertly from their hard drive. |
| Downloader | 1001 | share the primary functionality of downloading content. |
| Trojan | 1001 | misleads users of its true intent. |
| Worms | 1001 | spreads copies of itself from computer to computer. |
| Adware | 379 | hides on your device and serves you advertisements. |
| Dropper | 891 | surreptitiously carries viruses, back doors and other malicious software so they can be executed on the compromised machine. |
| Virus | 1001 | designed to spread from host to host and has the ability to replicate itself. |
| Backdoor | 1001 | a technique in which a system security mechanism is bypassed undetectably to access a computer or its data. |
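
Because Adware is noticeably smaller than the other families, classifiers trained on this data may benefit from class weighting. The following is a minimal sketch, not part of the original study: it takes the sample counts from the table above and computes inverse-frequency ("balanced") weights. The names `FAMILY_COUNTS` and `class_weights` are hypothetical and chosen only for illustration.

```python
# Sample counts copied from the table above (7107 samples in total).
FAMILY_COUNTS = {
    "Spyware": 832,
    "Downloader": 1001,
    "Trojan": 1001,
    "Worms": 1001,
    "Adware": 379,
    "Dropper": 891,
    "Virus": 1001,
    "Backdoor": 1001,
}


def class_weights(counts):
    """Return balanced weights: total / (number_of_classes * class_count).

    Rare classes get weights above 1, common classes slightly below 1.
    This weighting scheme is an assumption for illustration, not the
    authors' documented training setup.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * count) for name, count in counts.items()}


if __name__ == "__main__":
    total = sum(FAMILY_COUNTS.values())
    weights = class_weights(FAMILY_COUNTS)
    # Print families from smallest to largest, with share and weight.
    for name, count in sorted(FAMILY_COUNTS.items(), key=lambda item: item[1]):
        share = 100.0 * count / total
        print(f"{name:<11} {count:>5} {share:5.1f}%  weight={weights[name]:.3f}")
```

The resulting dictionary can be passed to any training routine that accepts per-class weights; whether weighting actually helps depends on the model and evaluation metric.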

The figure shows the general flow of generating the malware dataset. As shown in the figure, we first computed the MD5 hash values of the malware samples we collected from GitHub. We then looked up these hash values using the VirusTotal API and obtained the family of each sample from the reports of the 67 different antivirus products in VirusTotal. We observed that the families reported by these 67 antivirus products differ from one another for the same sample.

[Figure: general flow of the malware dataset generation]
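
The labeling flow described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' actual code: the VirusTotal lookup is omitted, the antivirus labels are passed in as plain strings, and the consolidation rule (substring matching against the 8 family names followed by a majority vote) is an assumption, since the text does not specify how disagreeing labels were resolved. All function names are hypothetical.

```python
import hashlib
from collections import Counter

# The 8 target families named in this README, in lowercase.
FAMILIES = (
    "trojan", "backdoor", "downloader", "worm",
    "spyware", "adware", "dropper", "virus",
)


def md5_of_bytes(data: bytes) -> str:
    """Return the hex MD5 digest used to look a sample up by hash."""
    return hashlib.md5(data).hexdigest()


def family_from_label(label: str):
    """Map one antivirus label to a target family, or None if no match.

    Assumption: a simple case-insensitive substring match. The first
    family in FAMILIES that appears in the label wins.
    """
    lowered = label.lower()
    for family in FAMILIES:
        if family in lowered:
            return family
    return None


def consolidate(labels):
    """Pick the most frequently reported family across all engines.

    Assumption: majority vote. Ties go to the family seen first.
    Returns None when no label matches any target family.
    """
    votes = Counter(
        family for family in map(family_from_label, labels) if family is not None
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up labels in the style of antivirus reports, for illustration only.
    example_labels = [
        "Trojan.Win32.Agent",
        "Backdoor.Generic",
        "Trojan-Spy.Zbot",
        "Unclassified.Suspicious",
    ]
    print(md5_of_bytes(b"hello"))
    print(consolidate(example_labels))
```

A real pipeline would read each sample from disk, hash it, and fetch the per-engine labels from the VirusTotal report before calling `consolidate`.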

Data Description
