Why Graphs Are Great For Security

Speed. Intuition. Learning.

Investigating a security risk can involve wrangling dozens of data sources. To uncover a successful phishing attack, you might have to integrate email records, process executions, user behavior, file access, netflow, antivirus alerts, etc. Gathering and correlating those logs is slow and cumbersome, making it difficult to follow the attack chain.

Sift Security’s product extracts the most important information from disparate data sources into one straightforward, scalable system: a relational graph. Using the graph makes investigations faster, easier, and more intuitive. The graph also provides context to our machine learning platform, enabling it to discover and prioritize relevant anomalous activity.

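Sift Security’s relational graph is a data structure with two parts: vertices, which represent entities, and edges, which represent the relationships between those entities.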

The graph represents real relationships extracted from logs. As we ingest a log into the index, we also extract the relationships between the entities it contains.

For example, consider the following Microsoft Windows audit log entry:

{ "EventTime": "2016-02-15 20:18:49", "Hostname": "", "Keywords": -9214364837600035000, "EventType": "AUDIT_SUCCESS", "SeverityValue": 2, "Severity": "INFO", "EventID": 4624, "SourceName": "Microsoft-Windows-Security-Auditing", "ProviderGuid": "{54849625-5478-4994-A5BA-3E3B0328C30D}", "Version": 2, "Task": 12544, "OpcodeValue": 0, "RecordNumber": 494242, "ActivityID": "{B0DAF283-66D6-0003-8EF2-DAB0D666D101}", "ProcessID": 744, "ThreadID": 1824, "Channel": "Security", "Category": "Logon", "Opcode": "Info", "SubjectUserSid": "S-1-5-18", "SubjectUserName": "WS02$", "SubjectDomainName": "WORKGROUP", "SubjectLogonId": "0x3e7", "TargetUserSid": "S-1-5-21-2257405197-1984329406-3649188883-1001", "TargetUserName": "robert", "TargetDomainName": "", "TargetLogonId": "0x185ed31", "LogonType": "2", "LogonProcessName": "User32 ", "AuthenticationPackageName": "Negotiate", "WorkstationName": "WS02", "LogonGuid": "{00000000-0000-0000-0000-000000000000}", "TransmittedServices": "-", "LmPackageName": "-", "KeyLength": "0", "ProcessName": "C:\\Windows\\System32\\svchost.exe", "IpAddress": "", "IpPort": "0", "ImpersonationLevel": "%%1833", "RestrictedAdminMode": "-", "TargetOutboundUserName": "-", "TargetOutboundDomainName": "-", "VirtualAccount": "%%1843", "TargetLinkedLogonId": "0x185ed4f", "ElevatedToken": "%%1842", "EventReceivedTime": "2016-02-26 12:03:02", "SourceModuleName": "windows", "SourceModuleType": "im_msvistalog" }

This log entry shows a successful logon attempt, which can be represented graphically as a small set of vertices connected by edges. The extra data in the log entry above are stored as attributes on the edges and vertices where appropriate. For example, the fact that the authentication attempt was successful is stored as an attribute on the authentication edge.
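To make the mapping concrete, here is a minimal Python sketch of how a logon entry like the one above might be turned into vertices and edges. It is an illustration only, not Sift Security's actual schema: the function name `logon_event_to_graph`, the vertex types `user` and `host`, the edge type `authenticated_to`, and the choice of which fields become attributes are all assumptions made for this example.

```python
def logon_event_to_graph(event):
    """Map one Windows logon event to vertices and edges (hypothetical schema)."""
    user = ("user", event["TargetUserName"])
    host = ("host", event["WorkstationName"])

    # Vertices are keyed by (type, name); extra fields become vertex attributes.
    vertices = {
        user: {"sid": event["TargetUserSid"]},
        host: {},
    }

    # One authentication edge from the user to the host. The outcome and
    # other details of the attempt are stored as edge attributes.
    edges = [{
        "src": user,
        "dst": host,
        "type": "authenticated_to",
        "attrs": {
            "success": event["EventType"] == "AUDIT_SUCCESS",
            "time": event["EventTime"],
            "logon_type": event["LogonType"],
            "process": event["ProcessName"],
        },
    }]
    return vertices, edges


# A subset of the fields from the log entry above.
event = {
    "EventTime": "2016-02-15 20:18:49",
    "EventType": "AUDIT_SUCCESS",
    "TargetUserSid": "S-1-5-21-2257405197-1984329406-3649188883-1001",
    "TargetUserName": "robert",
    "LogonType": "2",
    "WorkstationName": "WS02",
    "ProcessName": r"C:\Windows\System32\svchost.exe",
}

vertices, edges = logon_event_to_graph(event)
print(sorted(vertices))
print(edges[0]["type"], edges[0]["attrs"]["success"])
```

In this sketch the user and the host become vertices, and everything that describes the logon itself (outcome, time, logon type, requesting process) rides along on the edge, so nothing in the selected fields is lost when the entry is reduced to a graph.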

Once the mapping from raw logs to edges and vertices is created, the graph is constructed automatically during ingestion. The graph representation of each log entry is created and merged into a larger graph containing all of the log entries. Correlations between different data sources happen naturally, because an entity that appears in several sources is merged into a single vertex. The graph becomes an easy-to-use repository of all of the log data, enabling even novice analysts to pivot seamlessly between data sources. Sift Security supports many common log formats out of the box and includes the tooling necessary to create the graph mapping for new data sources.
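The merging step can be illustrated with a small Python sketch. It assumes vertices are identified by a `(type, name)` key, so the same host seen in two data sources collapses into one vertex. The `Graph` class, the edge names, and the netflow record (including the IP address, taken from a range reserved for documentation) are made up for this example and do not describe the product's internals.

```python
from collections import defaultdict


class Graph:
    """Toy merged graph: vertices keyed by (type, name), undirected adjacency."""

    def __init__(self):
        self.vertices = {}                  # key -> {"sources": set of source names}
        self.adjacency = defaultdict(list)  # key -> [(edge_type, neighbor, source)]

    def merge(self, edges, source):
        """Merge the edges extracted from one log entry into the graph."""
        for src, edge_type, dst in edges:
            for key in (src, dst):
                # An entity already seen in another source is reused, not duplicated.
                self.vertices.setdefault(key, {"sources": set()})["sources"].add(source)
            self.adjacency[src].append((edge_type, dst, source))
            self.adjacency[dst].append((edge_type, src, source))

    def neighbors(self, key):
        """Return everything directly connected to an entity, across all sources."""
        return self.adjacency.get(key, [])


graph = Graph()

# Edge extracted from a Windows logon entry.
graph.merge([(("user", "robert"), "authenticated_to", ("host", "WS02"))], "windows")

# Edge extracted from a hypothetical netflow record for the same host.
graph.merge([(("host", "WS02"), "connected_to", ("ip", "203.0.113.7"))], "netflow")

# Pivot: start at the host and see related entities from both data sources.
for edge_type, neighbor, source in graph.neighbors(("host", "WS02")):
    print(source, edge_type, neighbor)
```

Because both entries refer to the host `WS02` by the same key, the graph ends up with three vertices rather than four, and a single lookup on the host returns the logon from the Windows log alongside the connection from netflow. That shared vertex is what lets an analyst pivot from one data source to another.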