Please submit materials in

About the competition

SADET solicits training modules on topics that range from simple explanations that help to provide a common-sense context to highly technical presentations of data science and cybersecurity technologies.


For more information, please visit the Project and Workshop pages.

Submission requirements

Each submitted module should meet the following requirements:

  • The submitted module should match the description of one of the 19 modules outlined below.
  • The module should provide enough instructional material to support 2-3 hours of classroom instruction. Materials that address technical aspects in ways that engage high school students are highly sought. Preference will be given to modules that include video segments, PowerPoint slides and notes, games, and simulations; apps and computer versions are desired. The instructional modules should provide other teachers with enough material to implement them in their own classes. The submission should include a homework assignment consisting of at least 3 questions and a test/quiz consisting of at least 10 questions with an answer key.
  • A lab or mini project related to the lecture material may be submitted as part of the module, but this component is entirely optional.

All the above materials may be provided as a single zip file. The recorded lecture may be uploaded as a YouTube video, and the module submission may include the link to the video. Materials and the video link for the modules should be emailed to Ruth Mihalyi (RMM151@pitt.edu), Balaji Palanisamy (bpalan@pitt.edu) and Konstantinos Pelechrinis (kpele@pitt.edu) by the submission deadline.


Participants in the materials competition are strongly encouraged to send a 1-page description of the lecture content for the intended module and to seek feedback from the SADET project investigators on the suitability of the proposed lecture content for the module. A participant may submit any number of modules for the competition.


Modules will be judged based on accuracy and correctness of presentation, suitability for use in postsecondary instruction, and creativity and artistic composition. Each submitted module will be evaluated and classed as suitable or unsuitable for distribution on the SADET project website. Submissions judged suitable for distribution will be ranked, and the top three faculty developers will receive payments of $4000, $3000 and $2000 respectively. The 4th and 5th ranked faculty developers will receive an honorable mention award of $500 each.

Important dates

  • Announcement of the Competition: April 30th, 2018
  • 1-page summary of the proposed module (strongly recommended but not required): January 1, 2019
  • Submission of the materials for the proposed module: February 15, 2019
  • Announcement of the results: April 15, 2019

SADET Modules

The content of SADET modules will range from simple explanations that help to provide a common-sense context to highly technical presentations of data science and cybersecurity technologies. The modules are divided into four groups:


  • Digital Literacy modules
  • Introductory Data Science modules
  • Conceptual Data Security Modules
  • Technical Data Security and Data Science Modules

The Security-assured Data Science Materials competition for high school teachers requires the development of instances of the training materials for the modules, including lecture materials, homework assignments, tests and projects that could be used in classroom teaching and assessment.


| Module Group | Module Name | Example topics |
|---|---|---|
| Digital Literacy Modules | Module A.1: Digital Data Safety | Avoiding scams, keeping data safe, safe digital transactions |
| Digital Literacy Modules | Module A.2: Taking Reasonable precautions (Digital security) | Awareness of data thefts through malware, phishing attacks, use of firewalls |
| Digital Literacy Modules | Module A.3: Protecting Digital Data Rights | Rights to privacy, intellectual property, freedom of speech and protection from hate speech |
| Introductory Data Science Modules | Module B.1: Evidence-based Thinking | Thinking with data, benefits of data, difference between theory and data |
| Introductory Data Science Modules | Module B.2: Basic Statistical Concepts | Descriptive statistics and their pitfalls, numerical precision vs. significance, hypothesis testing |
| Introductory Data Science Modules | Module B.3: The Notion of Uncertainty | Probabilistic inference, sources of errors |
| Conceptual Data Security Modules | Module C.1: The Mind of the Hacker | Attacks on individuals' data and data of businesses, power of a hacker vs. a thief |
| Conceptual Data Security Modules | Module C.2: Data Security Basics: Confidentiality, Integrity, and Availability | Concepts and examples related to confidentiality, integrity and availability principles of security |
| Conceptual Data Security Modules | Module C.3: Vulnerabilities and Threats to Data | Real world examples related to data security vulnerabilities and threats, concepts related to mitigation |
| Conceptual Data Security Modules | Module C.4: Data Privacy and the Law | Privacy regulations such as HIPAA, FERPA and COPPA and their limitations |
| Technical Data Security and Data Science Modules | Module D.1: Protecting Data Confidentiality using Cryptography - I | Design and security strength of shift cipher, substitution and transposition ciphers, brute force attacks |
| Technical Data Security and Data Science Modules | Module D.2: Protecting Data Confidentiality using Cryptography - II | Symmetric key encryption, DES, AES, introduction to public key cryptography |
| Technical Data Security and Data Science Modules | Module D.3: Protecting Data using Role Based Access Control | Unix access control commands, introduction to RBAC, ANSI RBAC standard, FIPS standard, and X.509 standards |
| Technical Data Security and Data Science Modules | Module D.4: Data Mining Techniques | Prediction, classification and clustering techniques and the algorithms appropriate for addressing these tasks |
| Technical Data Security and Data Science Modules | Module D.5: Data Privacy | Fundamental concepts in data privacy such as k-anonymity and differential privacy |
| Technical Data Security and Data Science Modules | Module D.6: Privacy-aware Data Mining | Privacy-preserving classifiers, clustering anonymized data and differential privacy-aware prediction |
| Technical Data Security and Data Science Modules | Module D.7: Securing Data from Insider threats | Insider threats in an organization and common approaches to deter insider attacks |
| Technical Data Security and Data Science Modules | Module D.8: Email Classification and Spam Detection | Spam detection, phishing detection |
| Technical Data Security and Data Science Modules | Module D.9: Network Intrusion Detection | Signature-based solutions, supervised and unsupervised data mining solutions, deep packet inspection |

Digital Literacy Modules for Cybersecurity

Module A.1: Digital Data Safety: This module is intended to provide a basic introduction to safety in the online world. Just as one takes care in the physical world to provide for personal safety, there are things every person should know about life in the digital world. In the physical world, we protect our money and take steps to ensure that we don’t put ourselves in risky situations. Topics we plan to cover in this respect include: Avoid getting drawn into scams where “it looks too good to be true.” Be wary of offers of something for free in exchange for “just a little info”, and don’t give out personal information. Obviously, one wants to be careful about giving away one’s credit card number. It is less clear how to protect information about where you live, how you walk to school, when you are on vacation, what your “codes” may be, how old you are, etc. Keep your money safe: whom do you trust with your credit card? Stay away from places where there are “lurkers”. Be aware of cyberbullying, radicalization, and seduction.

Module A.2: Taking Reasonable precautions (Digital security): This module will introduce precautionary measures in the context of digital security. Some of the ways you secure your environment are actually very hard to implement. However, some of the things that can be done are pretty simple. Topics and concepts covered in this module will include: (i) Don’t accept software from people you don’t know, since it could be infected. (ii) Places where you can be more trustful of the software and content you download. (iii) Steganography as an example of how something can be hidden in content. (iv) A definition of malware and an explanation of how it can be detected. (v) Behavioral protection: how phishing and spear phishing work and how to avoid them. (vi) Best practices and suitable security tools for data protection: how to implement firewalls and security checks for viruses and worms.

Module A.3: Protecting Digital Data Rights: The digital world is a great place for communication with family and friends. This module will educate students that at each level of technology use, we invite surveillance and are tempted to violate the rights of others. Consider the differences between a secret whispered in a closed room, one written down in a letter, one uttered on a phone call, one sent via text messaging, and one written in an email. Websites hold a lot of information about you and your activities, as might be seen by a lover, someone who is not your friend, your parents, your employer, or the government. This module will also create awareness of the temptation to copy information from a source in a book or on the web and present it as your own. We will discuss the rights to privacy, intellectual property, freedom of speech and protection from hate speech.


Introductory Data Science Modules

Module B.1: Evidence-based Thinking: This module is intended to provide an introduction to evidence-based, scientific thinking. Its objective is to draw a clear distinction between “common sense/practice” thinking and factual thinking. Topics and concepts to be covered in this module will include: (i) collecting the right type of data, (ii) asking the right questions of the data, (iii) interpreting the results of the data analysis clearly, and (iv) the trifecta of “test, learn, adopt”. This module will be based on specific case studies that showcase the different conclusions reached through a “common-sense” line of thought and through the data-based process.

Module B.2: Basic Statistical Concepts: In this module we will introduce at a high level the basic concepts of statistical analysis. We will present different types of exploratory analysis through descriptive statistics, and we will point out potential problems that can arise from them. In particular, we will discuss the notion of robust statistics by using the example of mean versus median. We will also introduce the concept of statistical testing through specific security-based case studies. Another possible problem for a novice data analyst is confusion between precision and significance. It is very common for numbers with high precision to be treated as more “accurate” and significant than numbers/results with less precision.
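As one illustration of the mean-versus-median point, a module developer might use a short Python sketch like the one below. The data are invented for this example (hypothetical counts of failed logins per account), and `summarize` is an illustrative helper, not part of any SADET material.

```python
from statistics import mean, median

# Hypothetical data: failed login attempts per account in one day.
# The last account is assumed to be under attack and is an outlier.
failed_logins = [1, 0, 2, 1, 3, 0, 2, 1, 1, 250]


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)


avg, mid = summarize(failed_logins)
print(f"mean = {avg}, median = {mid}")
```

A single extreme value pulls the mean far above what a typical account experiences, while the median stays close to the typical value; this is the sense in which the median is the more robust statistic.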

Module B.3: The Notion of Uncertainty: Whenever making decisions based on data, one should be clear that the decisions made are the most-educated decisions, rather than guaranteed to be optimal. Every data analysis is associated with uncertainty, and most inferences are probabilistic, i.e., they assign a probability to events. However, an analyst needs to be clear that the fact that an event has low probability does not mean that it cannot happen. In this module, we will introduce the notion of uncertainty and probabilistic inference. We will discuss the various sources of errors that lead to this uncertainty (e.g., un-modeled variance of the dependent variables, inaccurate data, etc.).
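One way to make the point about low-probability events concrete is to ask how likely such an event is to occur at least once over many independent attempts. The Python sketch below assumes independent trials with the same probability each time; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def prob_at_least_once(p, n):
    """Probability that an event with per-trial probability p happens at
    least once in n independent trials, computed as 1 - (1 - p)**n."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Hypothetical numbers: a 0.1% chance per attempt, repeated 1,000 times.
chance = prob_at_least_once(0.001, 1000)
print(f"{chance:.3f}")
```

Under these assumptions, an event that is very unlikely on any single attempt becomes more likely than not (about 63%) across the whole set of attempts.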


Conceptual Data Security Modules

Module C.1: The Mind of the Hacker: To understand how cyber data theft is different, it is useful to compare cyber crime and traditional crime. Hackers don’t have to pick a lock; they can simply use a smart tool downloaded from the network. Further, to attack the data of an individual or an organization, they don’t have to go there; they can do it from the comfort of their own bedroom. Finally, they can attack 1,000,000 houses from one PC, and it can be done with incredible speed. This conceptual module provides a basic understanding of the mindset of hackers and educates students about the potential of cybercrimes to lead to data thefts compared to traditional crimes.

Module C.2: Data Security Basics: Confidentiality, Integrity, and Availability: In this module, we will cover the basic concepts needed to understand the goals of data security. These goals are commonly referred to as CIA: Confidentiality, Integrity, and Availability. Working backwards, it is important to ensure that the data and information you might want to use are available to you. Imagine going to the hospital and discovering that the doctor can’t access the information he or she needs about you to provide some essential service. That would be unthinkable. Also, consider a situation where you could never be sure that the bank knew how much money was in your checking account; if you can’t trust the integrity of the information you are given, you can’t rely on it. Finally, we expect that some information about us, such as our medical records, will be kept confidential. The goal of this module is to make students understand the fundamental goals of cybersecurity. Students will be provided with real world examples of confidentiality, integrity and availability attacks in a cybersecurity context. Various forms of confidentiality and integrity threats such as deception, disclosure, usurpation and disruption will be covered.

Module C.3: Vulnerabilities and Threats to Data: As we think about data security, we can think about it in three ways: vulnerabilities, threats, and mitigation. Consider a simple example. Let’s imagine that our home has two doors and 10 windows, five on the first floor and five on the second floor. Further, imagine one door has two locks, one being a deadbolt, and the other door has an old-fashioned skeleton key lock. Further, imagine the five first floor windows have locks but the five second floor windows do not. This analysis would help us to understand whether our home is vulnerable. Now, it might be the case that this home is in a walled compound that is patrolled regularly, or it might be that it is not; this tells us about the threats the home faces. Turning to mitigation, we might begin to examine what steps we might take, given the vulnerabilities and the threats, to mitigate attacks on digital data. This module will cover the fundamental concepts of threats, vulnerabilities and defense mechanisms in the context of digital data security. It will expose various types of security threats and vulnerabilities and provide an understanding of common defense and protection mechanisms. The objective is to recognize, understand and evaluate various security threats and challenges in computer systems. Students will be required to read articles related to recent data breaches and to demonstrate understanding of threats and vulnerabilities in the context of the chosen articles.

Module C.4: Data Privacy and the Law: This module addresses two distinct subtopics. The first traces the history of US law related to privacy. It begins with Brandeis and Warren (1890) and Westin’s 1967 extensions related to disclosure of private information. It will cover web privacy efforts as well as government directives such as the OMB Privacy Impact Assessment. Finally, the US and EU approaches to privacy will be contrasted, and sector regulations such as HIPAA, FERPA and COPPA will be explored.


Technical Data Security and Data Science Modules

Module D.1: Protecting Data Confidentiality using Cryptography – I: A basic introduction to data protection using cryptography will be covered in this module. The module will be designed to introduce the concepts of simple substitution and transposition ciphers, including shift ciphers and columnar transpositions. In this module, students will learn the design of simple encryption schemes and will be exposed to the vulnerabilities and security weaknesses of simple encryption techniques such as shift ciphers. The module presentation will include a demo application in Java to visually illustrate the concept behind substitution and transposition ciphers. The learning objective is to understand and evaluate the security strengths of simple substitution and transposition-based encryption schemes.
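To indicate the kind of material this module has in mind, the following is a minimal Python sketch of a shift cipher and a brute-force attack on it. It is an illustrative example only and is not the Java demo application mentioned above; the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase  # 26 letters, so only 26 possible keys


def shift_encrypt(plaintext, key):
    """Shift every letter forward by `key` positions; other characters
    (spaces, digits, punctuation) are left unchanged."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


def shift_decrypt(ciphertext, key):
    """Undo shift_encrypt by shifting backwards by the same key."""
    return shift_encrypt(ciphertext, -key)


def brute_force(ciphertext):
    """Try all 26 keys and return (key, candidate plaintext) pairs."""
    return [(k, shift_decrypt(ciphertext, k)) for k in range(26)]


secret = shift_encrypt("MEET AT NOON", 3)
for key, guess in brute_force(secret):
    print(key, guess)
```

Because the key space has only 26 values, an attacker can list every candidate plaintext and pick out the readable one by eye, which is the weakness the module asks students to evaluate.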

Module D.2: Protecting Data Confidentiality using Cryptography – II: This module will cover concepts related to symmetric key encryption and provide an overview of public-key encryption. A high-level overview of the properties of well-known symmetric key and public key encryption schemes such as DES, AES and RSA will be given. Students will be exposed to key applications of public key cryptography such as digital signatures and secure key exchange protocols. FIPS 186 Digital Signature Standard (DSS) and FIPS 197 Advanced Encryption Standard (AES) will be described and discussed. We will also review the FIPS 140 standard to discuss the requirements for a cryptographic module implemented within federal computer systems. The use and design of X.509 PKI certificates will also be discussed. Students will be able to explain the difference between symmetric key cryptography and public key cryptography and to explain how digital signatures are produced and verified using public key cryptographic techniques.

Module D.3: Protecting Data using Role Based Access Control: This module will discuss protecting data confidentiality and security using access control methods. It will cover basic access control mechanisms in general purpose operating systems and provide an introduction to Role Based Access Control. The ANSI RBAC standard, the FIPS standard, and the X.509 standards will be introduced and explained. For ANSI RBAC, we will discuss the timeline of its development, initiated by the NIST RBAC model. Students will recognize the basic access control mechanism in the Unix OS and learn to use access control commands to control permissions in the Unix operating system.
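As a small illustration of Unix permission bits, the Python sketch below sets a file's mode and prints it in the familiar `ls -l` style. It assumes a Unix-like system, uses a temporary file created only for the demonstration, and `set_and_describe` is a hypothetical helper; in a classroom the same effect would typically be shown with the `chmod` command.

```python
import os
import stat
import tempfile


def set_and_describe(path, mode):
    """Apply permission bits to a file (like `chmod`) and return them
    as an `ls -l` style string such as '-rw-r-----'."""
    os.chmod(path, mode)
    return stat.filemode(os.stat(path).st_mode)


# Create a throwaway file so the example does not touch real data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

try:
    # 0o640: owner may read and write, group may read, others get nothing.
    description = set_and_describe(path, 0o640)
    print(description)
finally:
    os.remove(path)
```

The octal digits map to owner, group, and others in that order, which gives students a direct link between the numeric form and the letters they see in a directory listing.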

Module D.4: Data Mining Techniques: This module will focus on the statistical objectives of prediction, classification and clustering in contexts where datasets are large. Students will be introduced to the classical statistical methodology of linear regression and will obtain an introduction to data mining methods from a statistical perspective. Students will learn modern non-linear methods such as spline methods, generalized additive models, decision trees, and support vector machines, and will receive an overview of advanced linear approaches, such as LASSO and linear discriminant analysis, as well as k-means clustering, nearest neighbours, and neural networks.

Module D.5: Data Privacy: In this module, issues related to data privacy will be introduced. Students will learn various forms of privacy protection mechanisms including data anonymization and data sanitization techniques. The presentation will show the vulnerabilities of naïve sanitization processes and demonstrate the significance of rigorous data privacy models such as k-anonymity and differential privacy. Applications of data privacy protection mechanisms for health care data and financial data will be covered and live demonstrations of data anonymization and differential privacy-based data perturbation will be presented to students.
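For instance, the idea behind k-anonymity can be shown with a short check that every combination of quasi-identifier values is shared by at least k records. The Python sketch below uses invented records and a hypothetical `is_k_anonymous` helper; it only tests the property and does not perform the anonymization itself.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized health records (made up for this sketch).
records = [
    {"zip": "152**", "age": "20-29", "diagnosis": "flu"},
    {"zip": "152**", "age": "20-29", "diagnosis": "asthma"},
    {"zip": "152**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "152**", "age": "30-39", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["zip", "age"], 2))
print(is_k_anonymous(records, ["zip", "age"], 3))
```

Each (zip, age) group here contains two records, so the table satisfies the property for k = 2 but not for k = 3.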

Module D.6: Privacy-aware Data Mining: This module will introduce the challenges of privacy-preserving data analysis and data mining. Students will be exposed to the utility tradeoffs of anonymization and differential privacy-based data perturbation and will analyze the impact on the accuracy of prediction techniques and classifiers operating on perturbed/anonymized data. Students will learn optimized data perturbation algorithms for k-means clustering, support vector machines, and nearest neighbor processing that yield higher accuracy on perturbed, anonymized, and differentially private data.

Module D.7: Securing Data from Insider threats: This module focuses on insider attacks that are carried out by individuals who are legitimately authorized to access data and other confidential resources in the system. Preventing insider attacks is a daunting task. The module will cover the ways in which people willingly or unwillingly provide access as insiders to information in an organization. In this module, students will learn mechanisms to understand the current context and historic behavior of users by applying data mining techniques to gauge the trustworthiness of users based on behavioral patterns. Policy constraints to manage the risks of suspicious users will be used to assess risk and trustworthiness before allowing access.

Module D.8: E-mail Classification and Spam detection: In this module, basic machine learning and data mining techniques for automated e-mail classification will be discussed. E-mail spam detection will be the main focus of the module, while phishing attacks through e-mail will also be examined. The module will cover the data-related methods that have been used to detect spam e-mails and how a spam filter operates. We will return to the notion of uncertainty introduced in Module B.3 and obtain a better understanding of why some spam e-mails pass the filter, while some non-spam e-mails are sent to the “spam folder”. Appropriate evaluation metrics for spam detection algorithms will also be introduced.
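As an example of such evaluation metrics, the Python sketch below computes precision, recall, and false positive rate from a filter's predictions. The labels are invented for illustration (1 means spam, 0 means not spam), and `spam_metrics` is a hypothetical helper rather than a function from any particular library.

```python
def spam_metrics(y_true, y_pred):
    """Compute precision, recall and false positive rate for a spam
    filter, where 1 = spam and 0 = not spam."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # good mail flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # good mail passed
    return {
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical labels for eight e-mails and a filter's predictions.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]

metrics = spam_metrics(actual, predicted)
print(metrics)
```

In this made-up case one spam message passes the filter (a false negative) and one legitimate message lands in the spam folder (a false positive), which are exactly the two kinds of error the module discusses.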

Module D.9: Network Intrusion Detection: In this module, data mining techniques for automated network intrusion detection will be introduced. We will begin with signature-based methods, and then we will discuss methods for detecting attacks for which no signatures exist. We will discuss the notions of monitoring, sampling and real-time analysis. We will also explain that there is no one-size-fits-all solution when it comes to security, and to network intrusion detection in particular. Students in this module will experiment with pre-implemented algorithms for network intrusion detection by simulating network traces involving attacks.

Sponsors & Partners

This competition is part of the project supported by the National Science Foundation under Grant No. 1730286, entitled CyberTraining: CDL: Security-Assured Data Science Workforce Development in Pennsylvania.

SADET Project Team

  • Balaji Palanisamy, Ph.D., Associate Professor, Principal Investigator
  • Konstantinos Pelechrinis, Ph.D., Associate Professor
  • James B.D. Joshi, Ph.D., Professor
  • Brian Stengel, IT Service Owner at University of Pittsburgh
  • Michael B. Spring, Ph.D., Associate Professor
  • Prashant K. Krishnamurthy, Ph.D., Professor
  • David Tipper, Ph.D., Professor
  • David Thaw, Ph.D., Assistant Professor
  • Robert Perkoski, Ed.D., Assistant Professor