RoMa aims to provide mechanisms that improve the security of ML in the application projects. RoMa will interact with other technology projects regarding attack and solution models. The objective of RoMa is to increase the robustness of neural networks and other ML algorithms against attacks that alter input data during the testing phase, either to evade correct classification or to enforce a desired classification.
SePIA is an application project addressing various challenges in automated Open Source Intelligence (OSINT). Its objectives are the encapsulation of the OSINT process in a secure environment following privacy by design, and the application of advanced crawling and information-gathering concepts to automate the search of available data sources, including the use of ML to improve the state of the art in crawling. Further, SePIA aims to improve data cleansing by adding a feedback loop to the crawling and analysis modules, and to improve the analysis methods for automated intelligence results based on ML.
Adversarial Attacks on NLP systems focuses on a second challenge in ML/AI security, in which AI systems are used as attackers. The focus here is on textual data (NLP). The results can be used in the SePIA project, as OSINT is often based on textual data. The project addresses hate speech and disinformation, which are relevant scenarios in OSINT applications.
This project addresses the important aspects of transparency and explainability of results and networks in ML. The aim is to build a software toolbox for explainable ML that also strengthens other security aspects of the algorithms. A robotic environment is used as an example.
The goal of this project is to explore Natural Language Processing methods that can dynamically identify and obfuscate sensitive information in texts, with a focus on implicit attributes of individuals, for example their ethnic background, income range, or personality traits. These methods will help to preserve the privacy of all individuals, both the authors and other persons mentioned in the text. Further, we go beyond specific text sources, such as social media, and aim to develop robust and highly adaptable methods that generalize across domains and registers. Our research program encompasses three areas. First, we will extend the theoretical framework of differential privacy to our implicit text obfuscation scenario. The set of research questions includes fundamental privacy questions related to textual datasets.
Second, we will determine to what extent unsupervised pre-training achieves domain-agnostic privatization. Third, we will examine the large gap between formal guarantees and meaningful privacy-preservation capabilities, which is due to a mismatch between the theoretical bounds and existing evaluation techniques based on attacking the systems.