This page contains tutorials, analyses and challenges done on IT & Cybersecurity datasets using Squey.
This article is a step-by-step tutorial aiming at loading and analyzing the bluecoat_proxy_big.zip dataset from Public Security Log Sharing in Squey.
The first section is devoted to the creation of a parsing file that will allow us to load the dataset. Should you be in a hurry, you can skip straight to the analysis as the file is provided below.
Before being able to load the dataset into the application, a parsing file (called format) should be created using the Format Builder tool.
In this article, we are addressing the challenge presented by detecteam.com.
“We have published one year of ssh logins/logouts of a valid administrator; However the account has been compromised using social engineering similar to the MGM attack which led to a ransomware being deployed.” - Detecteam
So here is the openssh.log_.zip (mirror) dataset and its associated openssh.log_.zip.format parsing file.
It looks like typical OpenSSH logs:
Sep 24 08:46:18 bidizidomo sshd[26168]: Accepted password for iworkinacasino from 173.194.42.31 port 63346 ssh2
Sep 24 08:46:18 bidizidomo sshd[26168]: pam_unix(sshd:session): session opened for user iworkinacasino(uid=1169) by (uid=0)
Sep 24 08:46:18 bidizidomo systemd-logind[515]: New session 8767 of user iworkinacasino.
Sep 24 08:46:18 bidizidomo sshd[26168]: pam_env(sshd:session): deprecated reading of user environment enabled
Sep 24 12:51:42 bidizidomo sshd[26971]: Received disconnect from 173.194.42.31 port 23568 disconnected by user
Sep 24 12:51:42 bidizidomo sshd[26971]: Disconnected from user iworkinacasino 173.194.42.31 port 23568
Sep 24 12:51:42 bidizidomo sshd[26971]: pam_unix(sshd:session): session closed for user iworkinacasino
Sep 24 14:32:41 bidizidomo sshd[44186]: Accepted password for iworkinacasino from 173.194.42.109 port 26603 ssh2
Sep 24 14:32:41 bidizidomo sshd[44186]: pam_unix(sshd:session): session opened for user iworkinacasino(uid=1169) by (uid=0)
...
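To make the structure of these lines concrete, here is a minimal Python sketch that splits a line into its syslog fields and extracts the user, source IP and port from successful logins. It is only an illustration of the log layout, not the content of the provided Squey format file; the names `LINE_RE`, `ACCEPTED_RE` and `parse_line` are hypothetical.

```python
import re
from typing import Optional

# Syslog-style prefix: "Sep 24 08:46:18 host process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<process>[\w-]+)\[(?P<pid>\d+)\]: "
    r"(?P<message>.*)$"
)

# Successful login: "Accepted password for USER from IP port PORT ssh2"
ACCEPTED_RE = re.compile(
    r"^Accepted password for (?P<user>\S+) from (?P<ip>\S+) port (?P<port>\d+)"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    # Enrich successful logins with user, source IP and source port.
    accepted = ACCEPTED_RE.match(record["message"])
    if accepted:
        record["event"] = "login"
        record["user"] = accepted.group("user")
        record["src_ip"] = accepted.group("ip")
        record["src_port"] = int(accepted.group("port"))
    return record


sample = (
    "Sep 24 08:46:18 bidizidomo sshd[26168]: Accepted password for "
    "iworkinacasino from 173.194.42.31 port 63346 ssh2"
)
print(parse_line(sample))
```

Lines that are not login events (PAM messages, disconnects, and so on) still yield the common fields, so they can be kept or filtered out later.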
During the data ingest step, the dataset was enriched in several ways:
This article aims at solving the PCAP-related questions from the DFIR MONTEREY 2015 Network Forensics Challenge using Squey.
Of course, the idea here is not really to solve the challenge, as it has been solved numerous times since then, but to see how much easier it is to solve using Squey.
The dataset 2014-11+DFIR+Network+Forensics+Challenge.zip was taken from the Netresec PCAP page.
Note: questions 1 and 4 were not solved because they didn’t involve any PCAP data.
Not sure what data your packet capture really contains, and is it too big to be opened with Wireshark or other tools? Visualize it using Squey, isolate packets or sessions of interest with arbitrarily complex criteria, and then export them to smaller PCAP file(s).
As an example, we will load the complete MACCDC 2012 PCAP dataset composed of 17 files (~17GB) and export HTTP communications between IPs 192.168.203.63 and 192.168.229.101 on port 80.
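The selection criterion in this example can be written down as a simple predicate. The following Python sketch is only an illustration of the filter logic (traffic in either direction between the two hosts, with port 80 on one side); it does not use Squey or read PCAP files, and the `Packet` type and `is_selected` function are hypothetical names.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Minimal, hypothetical view of a packet's addressing fields."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


HOSTS = {"192.168.203.63", "192.168.229.101"}
HTTP_PORT = 80


def is_selected(pkt: Packet) -> bool:
    """True for traffic between the two hosts, either way, involving port 80."""
    between_hosts = {pkt.src_ip, pkt.dst_ip} == HOSTS
    on_http_port = HTTP_PORT in (pkt.src_port, pkt.dst_port)
    return between_hosts and on_http_port


packets = [
    Packet("192.168.203.63", "192.168.229.101", 51234, 80),   # request
    Packet("192.168.229.101", "192.168.203.63", 80, 51234),   # response
    Packet("192.168.203.63", "192.168.229.101", 51235, 443),  # other port
    Packet("192.168.203.63", "10.0.0.1", 51236, 80),          # other host
]
selected = [p for p in packets if is_selected(p)]
print(len(selected))
```

Comparing the set of endpoints rather than source and destination separately keeps both directions of each conversation.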
Following a really small and easy challenge published on the PentesterAcademy blog, focused on the MACCDC 2012 DNS dataset analysed with ELK, we thought it could be a great exercise to guide you through solving it using Squey.
Click on the Local files... button located in the SOURCES section of the start page and browse to the compressed dataset.
The file format and column types will be automatically detected, so just click Yes and Save.
Here is a video of the detection of a successful phishing attack contained in 10 million rows of anonymized proxy logs.
Since version 5.0, Squey is able to import and export Apache Parquet files!
« Apache Parquet is an open-source file format that stores data efficiently in columnar format, provides different encoding types, and supports predicate filtering. With good compression ratios and efficient encoding, VPC flow logs stored in Parquet reduce your Amazon S3 storage costs. » - AWS Blog
Let’s take advantage of the fact that AWS VPC Flow Logs can be natively stored in Apache Parquet format to seamlessly visualize our network and understand traffic patterns, identify security issues, audit usage, and diagnose network connectivity.