Machine learning in cyber security: The first line of defense against modern threats

Dr. Sven Krasser, Chief Scientist at CrowdStrike, talks to CBR’s Ellie Burns about how machine learning could not only be a vital tool in the fight against modern threats, but also a way to derive more value from data and intelligence.

EB: Where do you think cyber security is failing in regard to AI/machine learning?

SK: If properly managed and leveraged, machine learning can be a real force amplifier for cyber teams. Machine learning analyses security-related data, including file “features” and behavioral indicators, across a massive data set. Oftentimes, billions of events can be used to “train” the system to detect unknown and never-before-seen attacks based on past behaviors. If machine learning algorithms are trained with data-rich sources, and augmented with behavioral analytics, they can be an extremely effective first line of defense against modern threats like ransomware. In reality, most companies don’t have the threat telemetry to train machine learning models, and that limits the effectiveness of the algorithms.

EB: On the flip side, how can machine learning be used to fight cyber security threats?

SK: Firstly, malware detection. There are on average more than 10 million new malware files every month. Signature-based approaches are no longer useful, and it’s difficult to keep up with the number of new malware threats. Augmenting machine learning with behavior-based prevention allows it to recognise and block never-before-seen variants and even fileless malware.

Dr. Sven Krasser currently serves as Chief Scientist for CrowdStrike, where he oversees the development of endhost and cloud-based Big Data technologies. Previously, Dr. Krasser was at McAfee where he led the data analysis and classification efforts for TrustedSource.

Secondly, attackers also use exploitation techniques. It is very challenging for traditional approaches to find these. Machine learning can help by analysing data at a larger scale, and has the advantage of more breadth than can be achieved by a human. More scale means organisations can pull larger amounts of data into the analysis. Especially for advanced threats, a lot of data is needed before trends and problems can be spotted.
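
To make the contrast between signatures and learned "features" concrete, here is a minimal Python sketch. It is an illustration only and not CrowdStrike's method: the byte strings are made-up toy samples, the two features (a count of suspicious API names and the fraction of non-printable bytes) are hypothetical choices, and the model is a deliberately simple nearest-centroid classifier. It shows a one-byte variant of a known sample slipping past an exact hash signature while still landing closer to the malicious examples in feature space.

```python
import hashlib

# Hypothetical API-name tokens chosen for illustration only.
SUSPICIOUS_TOKENS = (b"CreateRemoteThread", b"VirtualAllocEx", b"WriteProcessMemory")

# Made-up toy samples; real training sets contain vastly more data.
MALICIOUS = [
    b"\x90\x90\x90 VirtualAllocEx WriteProcessMemory CreateRemoteThread \x00\x01",
    b"\xcc\xcc VirtualAllocEx CreateRemoteThread \x02\x03\x04",
]
BENIGN = [
    b"print hello world and exit cleanly",
    b"open file, read config, close file",
]

# Signature approach: exact SHA-256 hashes of known-bad files.
KNOWN_BAD_HASHES = {hashlib.sha256(s).hexdigest() for s in MALICIOUS}


def signature_match(data: bytes) -> bool:
    """Return True only if the file is byte-for-byte identical to a known sample."""
    return hashlib.sha256(data).hexdigest() in KNOWN_BAD_HASHES


def features(data: bytes) -> tuple:
    """Two toy features: suspicious-token count and non-printable byte fraction."""
    hits = sum(data.count(tok) for tok in SUSPICIOUS_TOKENS)
    nonprintable = sum(1 for b in data if not 32 <= b < 127)
    return float(hits), nonprintable / len(data)


def centroid(samples: list) -> tuple:
    """Average the feature vectors of a list of samples."""
    vectors = [features(s) for s in samples]
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


MALICIOUS_CENTROID = centroid(MALICIOUS)
BENIGN_CENTROID = centroid(BENIGN)


def squared_distance(a: tuple, b: tuple) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ml_flag(data: bytes) -> bool:
    """Flag the file if its features sit closer to the malicious centroid."""
    f = features(data)
    return squared_distance(f, MALICIOUS_CENTROID) < squared_distance(f, BENIGN_CENTROID)


# A never-before-seen variant: the first malicious sample plus one extra byte.
variant = MALICIOUS[0] + b"\x00"
print("signature match:", signature_match(variant))
print("feature-based flag:", ml_flag(variant))
```

The appended byte changes the hash, so the signature lookup misses the variant, while its feature vector barely moves. A production system would use far richer features, behavioral telemetry and a properly validated model.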

EB: How is CrowdStrike utilising machine learning with regard to cybersecurity?

SK: The differentiator for CrowdStrike’s machine learning engine is that it is predicated on a cloud-based architecture operating on algorithms infused with the collective knowledge of a crowdsourced community where threat intelligence is aggregated and updated instantly. In our case, the data is generated by our ability to process almost 34 billion events (and growing) on a daily basis. Also, because our approach goes beyond signatures, we have the ability to prevent unknown and new malware variants, including fileless malware.

EB: Why should a company consider using machine learning as part of a cyber security strategy?

SK: The number of threats and their variations is exploding, and deriving signatures one-by-one for each of them is a losing proposition. Additionally, modern threats blend into the environment and only subtly differ from legitimate usage patterns. Detecting them requires looking at a larger amount of data. On both counts, machine learning can help. It brings two advantages to the table: analysing data at a larger scale and more breadth than can be achieved by a human analyst.

EB: What would your top advice be for companies looking to use machine learning in cyber security?

SK: To defeat sophisticated cyber attacks, enterprises need technology, expertise and high-grade threat intelligence. If applied correctly, machine learning can dramatically augment an organisation’s ability to fight off these attacks while deriving more value out of security data and threat intelligence.

However, the value that machine learning can bring to the table largely depends on the data available to feed into it. Machine learning cannot create knowledge; it can only extract it. The scope and size of data are most critical for effective machine learning. Companies should therefore assess how much data they have available to make machine learning a viable option.