GLOBAL RESEARCH SYNDICATE

ABCs of UEBA: M is for Machine Learning

by globalresearchsyndicate
November 13, 2019
in Data Analysis

If log data is the lifeblood of User and Entity Behavior Analytics (UEBA), then Machine Learning (ML) is the brain. Machine learning algorithms ingest data feeds and turn raw data into risk-prioritized intelligence. This is the value of ML in the world of behavior analytics. And it happens in real time, on big data, across all users and entities in the network.

Machine Learning in Action: How Does it all Work?

How does machine learning work as it relates to UEBA? It’s essentially a process that involves the following four phases:

  1. Collect user and entity event data
  2. Perform statistical analysis on the data to figure out which fields are usable
  3. Create variables/features that can capture the information encoded in the data
  4. Apply candidate algorithms to see which one best fits the data
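To make the first three phases concrete, here is a minimal sketch in plain Python. The event records, the field names (`user`, `host`, `hour`, `geo`), the 90% fill-rate cutoff and the off-hours window are all assumptions made for illustration; they are not taken from any particular UEBA product.

```python
from collections import defaultdict

# Phase 1: collect user and entity event data (hypothetical sample records).
events = [
    {"user": "alice", "host": "db01", "hour": 9, "geo": None},
    {"user": "alice", "host": "db01", "hour": 10, "geo": None},
    {"user": "bob", "host": "web02", "hour": 14, "geo": None},
    {"user": "bob", "host": "web03", "hour": 2, "geo": None},
]

# Phase 2: simple statistics to decide which fields are usable.
def fill_rate(records, field):
    """Fraction of records in which the field is present and not None."""
    return sum(r.get(field) is not None for r in records) / len(records)

fields = ["user", "host", "hour", "geo"]
usable = [f for f in fields if fill_rate(events, f) >= 0.9]  # assumed cutoff

# Phase 3: build per-user features from the usable fields.
def build_features(records):
    grouped = defaultdict(list)
    for r in records:
        grouped[r["user"]].append(r)
    features = {}
    for user, rows in grouped.items():
        off_hours = sum(r["hour"] < 6 or r["hour"] >= 20 for r in rows)
        features[user] = {
            "login_count": len(rows),
            "distinct_hosts": len({r["host"] for r in rows}),
            "off_hours_share": off_hours / len(rows),
        }
    return features

features = build_features(events)
print(usable)
print(features)
```

The resulting feature dictionaries are what phase 4 would hand to one or more candidate algorithms for comparison.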

Types of Machine Learning Algorithms

There are two main classes of machine learning algorithms: unsupervised learning algorithms and supervised learning algorithms. Within each class there are a number of algorithms. There is no “best” algorithm. It all depends on the data and your goal. Let’s look at a few that apply to UEBA. This is a non-exhaustive list:

  1. Unsupervised Learning Algorithms – characterized by clustering and groups
    • K-means
    • Hierarchical Clustering
    • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
    • Local Outlier Factor
    • One-Class SVM
  2. Supervised Learning Algorithms – characterized by tags (labels)
    • Linear Regression: fits a line
    • Logistic Regression: fits an S-shaped curve
    • Decision Trees
    • Neural Networks: weights continually adjusted during training
    • Naïve Bayes

Supervised vs Unsupervised: Which Algorithm to use?

When should you use a supervised learning algorithm versus an unsupervised learning algorithm? When your dataset has markers or tags, it is easier to use a supervised learning algorithm, because the tags tell the algorithm what it should fit.

Take, for example, a credit card statement. How do you figure out which charges are fraudulent? If the charges are tagged, then you can use a supervised learning algorithm. Fraudulent charges would be tagged as “1” and authorized charges as “0”. In this case, the supervised learning algorithm can learn from the tags how to distinguish between fraud and non-fraud. If the data is not tagged, however, you would need to use an unsupervised learning algorithm to identify the fraudulent transactions.

Think of it like this: when you are training a model to identify cat and dog images, the algorithm does not know the difference between a cat and a dog until you tell it. A tag is a label. It gives the algorithm a goal to adjust its parameters around. And labels help supervised learning algorithms find patterns in tagged data.
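Returning to the credit card example, the sketch below trains a tiny logistic regression on charges tagged “1” (fraud) or “0” (authorized). It is an illustration under assumptions, not a production fraud model: the amounts are made up, the only feature is the charge amount, and the learning rate and epoch count are arbitrary choices.

```python
import math

# Hypothetical labeled charges: (amount in dollars, tag), 1 = fraud, 0 = authorized.
charges = [(12, 0), (25, 0), (40, 0), (60, 0), (900, 1), (1200, 1), (1500, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.5, epochs=2000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for amount, tag in data:
            x = amount / 1000.0  # scale the feature so the step size behaves
            error = sigmoid(w * x + b) - tag  # the tag is the goal to adjust around
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(model, amount):
    """Return 1 (fraud) if the predicted probability is at least 0.5, else 0."""
    w, b = model
    return 1 if sigmoid(w * amount / 1000.0 + b) >= 0.5 else 0

model = train(charges)
print(predict(model, 30), predict(model, 1100))
```

Without the tags there would be no `error` term to minimize, which is why untagged data calls for an unsupervised approach instead.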

When you label user and entity behavior, supervised learning algorithms learn how to distinguish between good and malicious behavior. People use a number of supervised learning algorithms; the most common in use today are linear regression, logistic regression, and deep neural networks. These are the typical supervised machine learning algorithms used with UEBA platforms, listed from the simplest to the most complex.

Unsupervised learning algorithms are characterized by clustering and groups. You can run an unsupervised learning algorithm to “learn” which data points are similar and which ones are not. For example, let’s say someone is breaking into a machine. We quite literally don’t know what is going on, so we look at the machine logs. We start sampling the data and breaking it down into frequency counts, histograms and time series to see where the averages are. When you pass data through an unsupervised learning algorithm (for example, K-means), it clusters data points that are similar. Any outlier points will typically wind up in the smallest cluster. And that is where we find outlier behavior.

Let’s look at another example: say you want to cluster children in a classroom by height. Children with similar heights will be clustered into the same group. You’ll always find an odd one out who is either very tall or very short, and these individuals will stand out and form their own clusters. They will be tagged as outliers since their clusters are so small. You can use this mechanism to tag other similar data.
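The classroom example can be sketched with a very small one-dimensional K-means. The heights are made up, and the evenly spaced starting centroids are a simplification chosen to keep the run deterministic; library implementations usually pick starting points differently.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Tiny 1-D K-means; needs k >= 2. Centroids start evenly spaced from min to max."""
    lo, hi = min(values), max(values)
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return clusters

heights_cm = [120, 122, 121, 119, 123, 124, 118, 160]  # hypothetical classroom
clusters = kmeans_1d(heights_cm, k=2)
outliers = min(clusters, key=len)  # the smallest cluster holds the outliers
print(outliers)  # prints [160]
```

The same "smallest cluster" rule can be applied to log-derived features such as login counts or bytes transferred.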

Machine Learning Monitors User and Entity Behaviors

Machine learning algorithms are used to create models. Gurucul uses machine learning models to monitor user and entity behavior at scale. Take SSH logs. If you analyze SSH logs using a clustering algorithm, you will likely see the same user logging into the same machine or group of machines at approximately the same time(s) every day. However, if this user suddenly logs into a different machine, that login will stand out, and the new machine will be put into its own cluster as an outlier. This behavior is far from the user’s normal behavior, which is how ML identifies it as anomalous. The real question is: how risky is this behavior? To ascertain whether anomalous behavior is malicious, we look at additional context. What else is this user doing? What are users in the same peer group doing?
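As a rough illustration of the SSH case, the sketch below swaps the clustering algorithm for a simpler per-user frequency baseline, which is enough to show a login to an unfamiliar machine standing out. The login history, the host names and the 5% cutoff are assumptions.

```python
from collections import Counter

# Hypothetical SSH login history as (user, host) pairs.
history = (
    [("alice", "db01")] * 20
    + [("alice", "db02")] * 15
    + [("bob", "web01")] * 30
)

def host_profile(logins):
    """Count how often each user has logged in to each host."""
    return Counter(logins)

def is_anomalous(profile, user, host, min_share=0.05):
    """Flag a login if the host makes up less than min_share of the user's history."""
    user_total = sum(n for (u, _), n in profile.items() if u == user)
    if user_total == 0:
        return True  # no baseline for this user yet, so treat the login as unusual
    return profile[(user, host)] / user_total < min_share

profile = host_profile(history)
print(is_anomalous(profile, "alice", "db01"))    # familiar host
print(is_anomalous(profile, "alice", "hr-fs1"))  # host never seen for this user
```

A flag from this check only says the login is unusual; deciding whether it is risky still needs the additional context described above.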

When we look at user and entity activity with machine learning models, we use multiple algorithms to get to the truth. For example, Time Series Analysis refers to regression algorithms that use time dependencies within the data. Is the user operating out of bounds, at an unusual hour or in an unusual range of time? Combine that information with results from a K-means algorithm that looks at the groups of machines being targeted, and you have the context to determine, for example, that the user is working off hours on a system update.
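Here is one way the two signals might be combined. It is a sketch under assumptions: a z-score on the login hour stands in for a full time series model, a hand-written set of maintenance hosts stands in for the output of a clustering step, and the hour is treated as a plain number, so logins that wrap around midnight are not handled.

```python
from statistics import mean, pstdev

# Hypothetical baseline: hours of the day at which one user normally logs in.
baseline_hours = [9, 9, 10, 8, 9, 10, 9, 8]
# Hypothetical group of machines that routinely receive off-hours updates.
patch_window_hosts = {"build01", "build02"}

def hour_zscore(hour, history):
    """Distance of a login hour from the user's mean, in standard deviations."""
    spread = pstdev(history) or 1.0  # avoid dividing by zero on a flat history
    return (hour - mean(history)) / spread

def assess(hour, host, threshold=3.0):
    """Combine the time signal with host-group context into a short verdict."""
    if abs(hour_zscore(hour, baseline_hours)) <= threshold:
        return "normal hours"
    if host in patch_window_hosts:
        return "off hours, maintenance host"
    return "off hours, unexplained"

print(assess(10, "db01"))
print(assess(3, "build01"))
print(assess(3, "db01"))
```

The point of the second check is that an off-hours login alone is ambiguous; the host group supplies the context that separates a system update from something that needs a closer look.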

How Machine Learning Predicts and Detects Insider Threats

Predicting, detecting, and stopping insider threats is a key UEBA use case. Here is where the machine learning rubber meets the road, as they say. Given the appropriate log data, machine learning algorithms can detect when outsiders gain unauthorized access, which is the account compromise type of insider threat.

How do machine learning models predict malicious insiders? One example is using ML models to perform sentiment analysis on email logs. Sentiment analysis mines email data to see if someone is going to go off the deep end, so you can stop them before they do. You cannot scan the content of the emails due to data privacy laws, but you can scan the email subject lines, attachments, and sender and recipient details. This usually gives you enough information to see what’s going on.

In one use case with a customer, we were doing sender and recipient checks on pairs. All the outliers that jumped out were users who were emailing source code to gmail.com and yahoo.com accounts. To add to the context, we also looked at source code repository logs and discovered that a particular user had been downloading complete source code trees, which was very unusual behavior. As you can imagine, that user was quietly escorted from the premises.
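A sender and recipient check of this kind can be approximated with simple counting over email metadata. The records, the addresses and the threshold of three attachments below are invented for illustration, and a plain frequency count is used here in place of a trained model.

```python
from collections import Counter

PERSONAL_DOMAINS = {"gmail.com", "yahoo.com"}

# Hypothetical email metadata: (sender, recipient, has_attachment).
emails = [
    ("ann@corp.example", "raj@corp.example", True),
    ("ann@corp.example", "ann.home@gmail.com", False),
    ("dev7@corp.example", "dev7.private@gmail.com", True),
    ("dev7@corp.example", "dev7.private@gmail.com", True),
    ("dev7@corp.example", "d7@yahoo.com", True),
    ("raj@corp.example", "ann@corp.example", True),
]

def personal_attachment_counts(records):
    """Count the attachments each sender mailed to personal webmail domains."""
    counts = Counter()
    for sender, recipient, has_attachment in records:
        domain = recipient.rsplit("@", 1)[-1].lower()
        if has_attachment and domain in PERSONAL_DOMAINS:
            counts[sender] += 1
    return counts

def flag_senders(records, min_count=3):
    """Return senders at or above the (assumed) attachment threshold."""
    counts = personal_attachment_counts(records)
    return sorted(s for s, n in counts.items() if n >= min_count)

print(flag_senders(emails))  # prints ['dev7@corp.example']
```

Note that only metadata is inspected: the sketch never reads message bodies, in keeping with the privacy constraint described earlier.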

The greater the potential damage, the more critical it is to employ ML to predict and prevent that damage. In the technology sector, for example, senior management are renowned for taking intellectual property with them when they move to a competitor. ML takes politics out of the equation and flattens employment hierarchies. If you’re an insider threat, Gurucul UEBA will detect your bad behavior, whether you’re a CEO or an administrator. It’s just data science to us.

How Machine Learning Detects Unknown Threats

We talked about the misuse of personnel badges in our previous blog, “ABCs of UEBA: L is for Logs”. That’s one example of how machine learning can detect unknown threats. In another case, we found multiple logins on the same computer from different people in different parts of the company. It turns out that an employee was sharing her login with her manager. Was that an anomaly? Yes. But was it an actual threat? Unclear. However, we detected the anomaly with ML and the behavior was certainly unknown to our customer.

In another case, there was some unusual activity in the system logs and our customer could not figure out what was going on. We ran analytics on the logs and found that someone was logging into multiple accounts using the same cell phone, but from different locations. One was in New York and the other was in Boston. We saw the spike and reported it. It was a complete unknown. It may have been a cloned cell phone, but whatever it was, it wasn’t acceptable behavior. They closed the account and that was that!
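The cell phone case comes down to grouping logins by device and looking for devices tied to more than one account and more than one location. The sketch below shows that grouping on made-up records; the account names and device identifiers are hypothetical.

```python
from collections import defaultdict

# Hypothetical login records: (account, device_id, city).
logins = [
    ("jsmith", "phone-41", "New York"),
    ("mlee", "phone-41", "Boston"),
    ("akhan", "phone-77", "Boston"),
    ("akhan", "phone-77", "Boston"),
]

def shared_devices(records):
    """Return devices used by more than one account from more than one city."""
    accounts, cities = defaultdict(set), defaultdict(set)
    for account, device, city in records:
        accounts[device].add(account)
        cities[device].add(city)
    return sorted(
        d for d in accounts if len(accounts[d]) > 1 and len(cities[d]) > 1
    )

print(shared_devices(logins))  # prints ['phone-41']
```

As in the cases above, the output is an anomaly to investigate rather than proof of a threat.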

Learn more about how Gurucul’s behavior-based security analytics implements machine learning models for advanced threat detection and prevention by reading this blog post.

Customize Machine Learning Models with Gurucul STUDIO™

Sometimes, out-of-the-box machine learning models will not yield the most effective results. In such cases, it’s a huge benefit to leverage a UEBA platform that gives you the ability to customize ML models or build your own. Gurucul STUDIO enables you to create custom ML models without coding and with only minimal knowledge of data science. Gurucul STUDIO provides a step-by-step graphical interface to select attributes, train models, create baselines, set prediction thresholds and define feedback loops. It supports an open choice for big data and a flex data connector to ingest any on-premises or cloud data source for desired attributes. It also provides an analytics Software Development Kit (SDK) to allow you to build models outside the platform (in Python, Java, whatever you like) and import them into Gurucul UEBA. Contact us to see a demo or for more information on our UEBA platform.

The post ABCs of UEBA: M is for Machine Learning appeared first on Gurucul.

*** This is a Security Bloggers Network syndicated blog from Blog – Gurucul authored by Jane Grafton. Read the original post at: https://gurucul.com/blog/abcs-of-ueba-m-is-for-machine-learning
