Comprehensive
Data Analysis
Gain hands-on experience with vectorized computing, data frame management, and creating both static and interactive visualizations, essential for data interpretation and presentation.
We made this course to teach cybersecurity professionals how to use AI/ML to defend their organizations. We also want cybersecurity professionals to understand how artificial intelligence applications can be attacked and what the associated risks are.
This course is in development and will be released toward the end of September 2024. Join the waitlist to learn more and to hear about giveaways and early-bird rates!
This interactive course will teach security professionals how to use data science techniques to quickly manipulate and analyze network and security data, and ultimately uncover valuable insights from it. The course covers the entire data science process: data preparation, feature engineering and selection, exploratory data analysis, data visualization, machine learning, model evaluation and optimization, and finally implementation at scale, all with a focus on security-related problems.
Participants will learn how to read in data in a variety of common formats and then write scripts to analyze and visualize that data. A non-exhaustive list of what will be covered includes:
Using machine learning and AI to detect network attacks within your organization
Hunting anomalous indicators of compromise and reducing false positives
Quickly and efficiently parsing executables, log files, and PCAP files, and extracting artifacts from them
Writing scripts to efficiently read and manipulate CSV, XML, and JSON files
Using the Pandas library to quickly manipulate tabular data
Preprocessing raw security data for machine learning and feature engineering
Building, applying, and evaluating machine learning algorithms to identify potential threats
Automating the process of tuning and optimizing machine learning models
Using supervised learning algorithms such as Random Forests, Naive Bayes, K-Nearest Neighbors (K-NN), and Support Vector Machines (SVM) to classify malicious URLs and identify SQL Injection
Applying unsupervised learning algorithms such as K-Means Clustering to detect anomalous behavior
Rapidly and effectively visualizing data using Python
Attacking and exploiting machine learning models using adversarial techniques
Using NLP to detect spam and social engineering attacks
Using the latest LLMs for security analysis
Attacking and defending LLM-based applications against prompt injection
Anyone who wishes to incorporate automated data analysis, artificial intelligence, machine learning and data science into their cybersecurity work should take this course and expect the following outcomes:
Use the Python data science ecosystem to rapidly prepare, explore and visualize cybersecurity data
Build and evaluate common machine learning models (both supervised and unsupervised) and apply these techniques to cybersecurity use cases
Develop unsupervised models to uncover anomalies and other patterns in cybersecurity data
Level Effect’s Cybersecurity Fundamentals courses starting with IT
GTK Cyber's Python Programming for Data Science and Cybersecurity
0-1+ years of professional experience in technology, preferably within Data Science
Hobbyists with a solid understanding of data science, cybersecurity, or IT, and some Python and Jupyter knowledge or a willingness to learn them
Modules
Units
Labs
Name: Charles Givre (LinkedIn)
Currently: Head of Artificial Intelligence, Stealth Startup
Bio: Charles is the CEO and founder of DataDistillr, which is dedicated to making the world's data easy to use and query. Prior to founding DataDistillr, Charles worked as a data scientist in cyber for JP Morgan and Deutsche Bank. Mr. Givre has taught (and is teaching) security data science courses at Black Hat and is a sought-after instructor. Mr. Givre co-authored the O'Reilly book Learning Apache Drill and is the PMC Chair for the Apache Drill project.
Name: Curtis Lambert (LinkedIn)
Currently: Senior Data Scientist, Raytheon
Bio: Curtis has more than 15 years of experience supporting cybersecurity missions for the U.S. DOD, specializing in the application of data science techniques to national security challenges across cyberspace. He loves taking on challenges others say can't be solved. Curtis started his career as a heavy equipment mechanic in central California working on agricultural equipment. He spent 6 years in the U.S. Army as a linguist and data analyst before becoming a consultant with BAH, where he spent 9 more years supporting a variety of national security missions. Curtis is a CISSP and holds multiple SANS certifications. He is a relentless pursuer of knowledge and constantly engages in self-education through books, videos, and courses.
Tailored for cybersecurity applications, including practical training on classifiers, clustering, anomaly detection, and deep learning, all framed within security contexts. Address the challenges of hacking machine learning models, equipping students with knowledge to protect AI systems.
Focus on the practical implications for cybersecurity and AI model hacking. Students explore neural networks, including CNNs and RNNs, learning to apply these to security tasks and understand how to safeguard against vulnerabilities in AI technologies.
Note - this content is not finalized and may be subject to change prior to release.
This module introduces the course, as well as key concepts of data science and their application to security.
Define Python
Introduce the data science process
Discuss case studies of the application of data science and machine learning to security
Introduce the Python data science ecosystem to include Jupyter Notebook and various Python modules
Introduce the Griffon VM
This module introduces the concept of vectorized computing and shows how to create, manipulate, and summarize one-dimensional data using the Pandas module in Python. It also covers basic statistical concepts.
Create a series object
Explore and summarize data within a Series
Understand and generate Tukey five-number summaries, and calculate other common statistical measures
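As a minimal sketch of the objectives above, the following creates a Series, applies a vectorized operation, and computes a five-number summary. The sample values and the name bytes_sent are made up for illustration and do not come from the course material.

```python
import pandas as pd

# Hypothetical sample: bytes transferred per connection (made-up values).
bytes_sent = pd.Series([120, 340, 560, 780, 15000], name="bytes_sent")

# Vectorized arithmetic applies to every element at once, with no explicit loop.
kilobytes = bytes_sent / 1024

# Five-number summary: minimum, lower quartile, median, upper quartile, maximum.
# Note: quantile() interpolates linearly, which can differ slightly from
# Tukey's hinge-based quartiles on some samples.
five_num = bytes_sent.quantile([0.0, 0.25, 0.5, 0.75, 1.0])

# Other common measures.
mean = bytes_sent.mean()
std = bytes_sent.std()  # sample standard deviation (ddof=1)

print(five_num)
print(f"mean={mean:.1f}, std={std:.1f}")
```

The single large value pulls the mean far above the median, which is one reason the five-number summary is often more informative than the mean alone for skewed security data.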
This module builds on the concepts taught in the previous module and introduces students to the DataFrame: a two-dimensional, vectorized data structure. This module also covers how to ingest security data directly into a Pandas DataFrame.
Create a DataFrame using the various read_ functions
Import various security data into dataframes
Flatten complex nested data
Join and merge data sets
Calculate aggregate statistics
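The objectives above can be sketched roughly as follows. The firewall log, the asset inventory, and all column names are hypothetical stand-ins, not data used in the course. The pandas calls (read_csv, json_normalize, merge, groupby) are standard ones.

```python
import io

import pandas as pd

# Hypothetical firewall log in CSV form (values are made up for illustration).
csv_text = """src_ip,dst_port,bytes
10.0.0.5,443,1200
10.0.0.5,22,300
10.0.0.9,443,800
"""
conns = pd.read_csv(io.StringIO(csv_text))

# Hypothetical nested records, for example from a JSON asset inventory.
assets = [
    {"ip": "10.0.0.5", "host": {"name": "web01", "owner": "ops"}},
    {"ip": "10.0.0.9", "host": {"name": "db01", "owner": "data"}},
]
# json_normalize flattens nested dictionaries into dotted column names
# such as "host.name" and "host.owner".
asset_df = pd.json_normalize(assets)

# Join the connection records to the asset metadata on the IP address.
merged = conns.merge(asset_df, left_on="src_ip", right_on="ip", how="left")

# Aggregate: total bytes and connection count per host.
summary = merged.groupby("host.name")["bytes"].agg(["sum", "count"])
print(summary)
```

A left join keeps every connection record even when the inventory has no matching host, which is usually the safer default when enriching logs.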
Data visualization is a powerful technique to have in your analytic toolkit. This module will cover the theory of data visualization as well as the actual process and coding of creating effective visualizations.
Understand techniques for making effective data visualizations
Be familiar with various modules in Python to create visualizations
Be able to create both static and interactive visualizations
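For the static case, a minimal Matplotlib sketch might look like the following. The hourly counts are made-up toy data, and the choice of a bar chart is only an illustration. Interactive plotting is not shown here.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical counts of failed logins per hour (made-up values).
hours = list(range(6))
failed_logins = [3, 4, 2, 40, 5, 3]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(hours, failed_logins, color="steelblue")
ax.set_xlabel("Hour of day")
ax.set_ylabel("Failed logins")
ax.set_title("Failed logins per hour (toy data)")

# Render to an in-memory PNG; fig.savefig("plot.png") would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
print(f"rendered {len(png_bytes)} bytes")
```

Labeling both axes and stating that the data is a toy sample in the title are small habits that keep a chart from being misread once it leaves the notebook.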
The Machine Learning modules walk the student through the machine learning process from beginning to end, starting with feature engineering and selection, model creation and ultimately evaluating and improving model performance.
Complete walkthrough of the machine learning process
This module introduces students to the concepts behind classifiers and various classification algorithms. Students will also learn to evaluate classifier performance.
Understand the functioning of a classifier and various classification algorithms including Support Vector Machines, Decision Trees, and Random Forest
Be able to create models using various classification algorithms in Python using Scikit-Learn
Be able to evaluate the performance of a model and tune its hyperparameters.
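A rough sketch of building, tuning, and evaluating a classifier with Scikit-Learn is shown here. It uses synthetic data from make_classification as a stand-in for labeled security data, and the hyperparameter grid is an arbitrary illustration rather than a recommended setting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for labeled security data (1 = malicious, 0 = benign).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Small, illustrative hyperparameter grid searched with 3-fold cross-validation.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="f1"
)
search.fit(X_train, y_train)

# Evaluate the best model on data it has not seen during tuning.
y_pred = search.best_estimator_.predict(X_test)
test_accuracy = search.best_estimator_.score(X_test, y_test)
print(search.best_params_)
print(classification_report(y_test, y_pred))
```

Holding out a test set that the grid search never touches gives a more honest estimate of performance than the cross-validation score used for tuning.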
This module introduces students to unsupervised learning and how to apply it to security problems. The module mainly focuses on clustering and its uses and limitations.
Understand various distance measurement functions
Understand the concepts behind K-Means and DBSCAN
Be able to create clusterers and evaluate their performance
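As an illustrative sketch, the following fits K-Means and DBSCAN with Scikit-Learn on synthetic blobs and scores the K-Means result with the silhouette coefficient. The data, the eps and min_samples values, and the choice of three clusters are assumptions made for the example only.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for numeric features extracted from security data.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Distance-based methods are sensitive to feature scale, so standardize first.
X = StandardScaler().fit_transform(X)

# K-Means needs the number of clusters up front.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)
kmeans_silhouette = silhouette_score(X, kmeans_labels)

# DBSCAN infers the number of clusters from density; label -1 marks noise points.
dbscan = DBSCAN(eps=0.3, min_samples=5)
dbscan_labels = dbscan.fit_predict(X)
n_noise = int((dbscan_labels == -1).sum())

print(f"K-Means silhouette: {kmeans_silhouette:.3f}")
print(f"DBSCAN noise points: {n_noise}")
```

The silhouette coefficient ranges from -1 to 1, with higher values indicating tighter, better separated clusters. It is one of several possible evaluation measures when no ground-truth labels exist.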
This module covers various techniques that can be used to detect anomalies in security data. This module will introduce the students to various approaches to anomaly detection, including forecasting, unsupervised machine learning and other statistical techniques.
Understand different approaches for detecting anomalies in security data
Be able to implement common unsupervised machine learning algorithms such as one-class support vector machines and isolation forests to detect anomalous data
Be able to perform anomaly detection using statistical metrics such as the Grubbs’ test
Understand the challenges associated with using Machine Learning to detect anomalies
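Two of the approaches named above can be sketched as follows: a two-sided Grubbs' test built on SciPy's t distribution, and an isolation forest from Scikit-Learn. The helper name grubbs_test, the login counts, the synthetic points, and the contamination value are all hypothetical choices for illustration.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import IsolationForest


def grubbs_test(values, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier; assumes roughly normal data."""
    x = np.asarray(values, dtype=float)
    n = x.size
    deviations = np.abs(x - x.mean())
    suspect_index = int(deviations.argmax())
    g_stat = deviations[suspect_index] / x.std(ddof=1)
    # Critical value derived from the t distribution with n - 2 degrees of freedom.
    t_crit = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t_crit**2 / (n - 2 + t_crit**2))
    return suspect_index, g_stat, g_crit, bool(g_stat > g_crit)


# Hypothetical logins per day for one account (made-up values).
logins = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
index, g_stat, g_crit, is_outlier = grubbs_test(logins)
print(f"suspect={logins[index]}, G={g_stat:.3f}, critical={g_crit:.3f}, outlier={is_outlier}")

# Isolation forest on synthetic data: a dense cluster plus a few distant points.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
distant = rng.uniform(low=6.0, high=8.0, size=(5, 2))
X = np.vstack([normal, distant])

# contamination is the assumed fraction of anomalies; here it is a guess.
forest = IsolationForest(contamination=0.03, random_state=0)
labels = forest.fit_predict(X)  # -1 = anomaly, 1 = normal
print("flagged rows:", np.flatnonzero(labels == -1))
```

Both methods depend on assumptions that real security data often violates: Grubbs' test expects approximately normal data with at most one outlier, and the isolation forest needs a contamination estimate that is rarely known in advance.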
Deep learning is one of the newest and most exciting areas in machine learning. In this module, we will give students a conceptual overview of deep learning and its application to security problems. Given the complexity of this topic, this module is intended to be a conceptual introduction, not an in-depth technical course.
Understand the basic concepts behind deep neural nets (DNN)
Become familiar with the Python deep learning ecosystem
Understand the concepts behind more advanced NN such as convolutional NNs
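To make the basic idea concrete, here is a minimal NumPy sketch of a forward pass through a small fully connected network with one hidden layer. The layer sizes and random weights are arbitrary assumptions, and no training is performed. A real project would normally use a deep learning framework instead.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    """Rectified linear unit: negative values become zero."""
    return np.maximum(z, 0.0)


def sigmoid(z):
    """Squash any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Four hypothetical samples with three input features each.
X = rng.normal(size=(4, 3))

# Randomly initialized weights and zero biases (untrained, for illustration only).
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)

# Each layer is a matrix multiplication followed by a nonlinear activation.
hidden = relu(X @ W1 + b1)
prob = sigmoid(hidden @ W2 + b2)  # one score per sample, usable as a probability

print(prob.ravel())
```

Training consists of adjusting W1, b1, W2, and b2 so that these output scores match known labels, which is the part that frameworks automate.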
This module covers the topics of deep learning, neural networks, convolutional neural networks, recurrent neural networks, and their applications in cybersecurity. It also discusses the use of deep learning tools such as TensorFlow and Keras.
Gain understanding of deep learning concepts including neural networks, CNNs, and RNNs.
Learn how to apply deep learning techniques to cybersecurity tasks.
Acquire practical skills in using TensorFlow, Keras, and Word2Vec for implementing deep learning models and processing cybersecurity data.
One of the most interesting topics in security data science is the possibility that machine learning models can be hacked. This module covers the real-world implications of hacking machine learning models and the techniques used to hack them.
Understand the risks of deploying ML models
Understand the techniques used to hack ML models
Understand ways of minimizing risk to a production model
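One well-known family of attacks is evasion, where inputs are nudged so that a trained model misclassifies them. The sketch below applies a gradient-sign style perturbation to a logistic regression model on synthetic data. The dataset, the epsilon value, and the white-box assumption (the attacker can read the model weights) are illustrative choices, not a description of the course labs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a detection dataset (1 = malicious, 0 = benign).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Accuracy is measured on the training data for brevity only.
clean_acc = model.score(X, y)

# For a linear model, the gradient of the decision score with respect to the
# input is the weight vector, so its sign gives the most damaging direction
# under a per-feature perturbation budget.
w = model.coef_[0]
epsilon = 0.5  # assumed maximum change allowed per feature

# Push class-1 samples toward class 0 and class-0 samples toward class 1.
direction = np.where(y[:, None] == 1, -1.0, 1.0) * np.sign(w)
X_adv = X + epsilon * direction
adv_acc = model.score(X_adv, y)

print(f"accuracy on clean inputs:     {clean_acc:.3f}")
print(f"accuracy on perturbed inputs: {adv_acc:.3f}")
```

Because every sample is moved toward the wrong side of the decision boundary, accuracy on the perturbed inputs cannot exceed accuracy on the clean ones. How far it falls depends on epsilon and on how close the data sits to the boundary.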
Complete a course assessment to earn a certificate of completion!
The price will be announced closer to the release date. Expect it to fall within the range of $400 to $800.
Python for Data Analysis
Data Science for Business
Creating a Data-Driven Organization
Data-Driven Security
Mastering Machine Learning with scikit-learn
Hands-On Machine Learning with Scikit-Learn and TensorFlow
Deep Learning
It would be beneficial for participants to be comfortable with the basics of Python programming, but this is not required to take the course.
Anaconda
TensorFlow (and supporting libraries)
NumPy
Scikit-learn
YellowBrick
Seaborn
Pandas Profiling
Matplotlib
VMware Workstation/Player/Fusion