Enhancing Machine Learning Models in Cybersecurity to Combat Elusive Malware with EMBER2024

Cybersecurity researchers unveiled EMBER2024, a new version of the open-source EMBER malware dataset, at KDD 2025. It is designed to advance machine learning models in cybersecurity.

By Administrator

September 13, 2025, 6:48 PM

In cybersecurity, building effective defenses against malware is a moving target. The latest advancement is EMBER2024, an update to the original EMBER dataset first released in 2018.

EMBER2024 was presented at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2025) in Toronto in August 2025. It builds on the influential original and is described in an academic paper titled "EMBER2024: A Benchmark Dataset for Holistic Evaluation of Malware Classifiers".

The new version of the dataset, developed by researchers Robert J. Joyce, Gideon Miller, Phil Roth, Richard Zak, Elliott Zaresky-Williams, Hyrum Anderson, Edward Raff, and James Holt, includes a challenge set of 6,315 files that no antivirus (AV) product in VirusTotal initially detected as malicious but that were later determined to be malicious. By highlighting the hardest files to classify, this set reflects the difficulties of training and shipping a real commercial AV solution.
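The selection rule behind such a challenge set can be sketched in a few lines. This is a minimal illustration, not the dataset's actual construction code: the `ScanRecord` class and its field names are hypothetical, and the sample records are made up.

```python
from dataclasses import dataclass


@dataclass
class ScanRecord:
    sha256: str
    # Number of AV engines that flagged the file when it was first scanned
    # (hypothetical field name).
    first_seen_detections: int
    # Label assigned after later re-analysis: "malicious" or "benign"
    # (hypothetical field name).
    later_label: str


def select_challenge_set(records):
    """Keep files that no engine flagged at first but that were later labeled malicious."""
    return [
        r for r in records
        if r.first_seen_detections == 0 and r.later_label == "malicious"
    ]


# Made-up records for illustration only.
records = [
    ScanRecord("aaa", 0, "malicious"),   # missed at first, later malicious: kept
    ScanRecord("bbb", 12, "malicious"),  # detected at first scan: excluded
    ScanRecord("ccc", 0, "benign"),      # never malicious: excluded
]

challenge = select_challenge_set(records)
print([r.sha256 for r in challenge])
```

Only the first record satisfies both conditions, which is what makes such files hard test cases: at first scan they were indistinguishable from benign files for every engine.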

The EMBER2024 dataset spans over 3.2 million files from six different file formats and includes metadata, labels, and calculated features. The feature calculation code was updated to use the most recent version of the pefile library instead of LIEF, a change intended to keep it compatible with future versions of Python.

EMBER2024 also includes a collection of advanced malware that has demonstrated its ability to evade antivirus products. The dataset provides data scientists conducting cybersecurity research with an extensive, modern dataset to support the training and evaluation of machine learning models for malware detection.

Moreover, the dataset features seven different types of labels and tags that support training classifiers on seven common tasks, including malicious/benign detection, malware family classification, and malware behavior identification. These labels provide a comprehensive view of the malware landscape, aiding in the development of more robust and versatile detection systems.
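To make the idea of several label types per file concrete, the sketch below derives training targets for three of the tasks named above from the same records. The JSON-lines layout, the field names (`label`, `family`, `behaviors`), and the sample values are assumptions for illustration and do not reproduce the dataset's real schema.

```python
import json

# Hypothetical JSON-lines records: one file per line, several label types per file.
RAW = """\
{"sha256": "aaa", "label": 1, "family": "family_a", "behaviors": ["ransomware", "spyware"]}
{"sha256": "bbb", "label": 0, "family": null, "behaviors": []}
{"sha256": "ccc", "label": 1, "family": "family_b", "behaviors": ["spyware"]}
"""


def load_records(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def detection_targets(records):
    """Binary task: 1 for malicious, 0 for benign."""
    return [r["label"] for r in records]


def family_targets(records):
    """Multi-class task: only malicious files with a known family take part."""
    return [(r["sha256"], r["family"]) for r in records if r["label"] == 1 and r["family"]]


def behavior_targets(records, vocabulary):
    """Multi-label task: one 0/1 indicator per behavior tag in the vocabulary."""
    return [[int(tag in r["behaviors"]) for tag in vocabulary] for r in records]


records = load_records(RAW)
print(detection_targets(records))
print(family_targets(records))
print(behavior_targets(records, ["ransomware", "spyware"]))
```

The design point is that one set of files can serve several classifiers: each task reads a different label field, and tasks such as family classification use only the subset of files for which that label exists.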

The paper reports 14 benchmark models trained on different subsets of the data and on varying classification tasks. Their results give future research a baseline for comparison.

The popularity of the original EMBER dataset has led to related projects like EMBERSim and now EMBER2024. The code released with EMBER2024 for building the dataset could also be used to create a larger dataset for larger models, or to study how benign and malicious software evolve over time.

Open source initiatives like EMBER2024 exemplify industry-wide cooperation that drives innovation and supports continuous product improvement. The EMBER2024 public release includes the code used to construct the dataset, allowing researchers with access to VirusTotal to replicate the dataset in the future.

The EMBER2024 project reflects an ongoing commitment to research in the cybersecurity industry. As of this writing, the original EMBER paper has been cited in academic research over 700 times since its publication in 2018, underscoring its significance in the field.

In conclusion, the EMBER2024 dataset offers a valuable resource for researchers and practitioners in the field of cybersecurity, providing a comprehensive and up-to-date dataset for training and evaluating malware detection models. The dataset's focus on challenging files and advanced malware makes it an invaluable tool in the ongoing battle against cyber threats.
