SAFEWORDS
Ethical, Privacy-Preserving and Trustworthy Language Technologies within HumanAIze
Background and Rationale
The rapid evolution of Large Language Models (LLMs) has transformed natural language processing and generative AI applications. However, their development and deployment raise critical challenges related to data protection, bias amplification, explainability, and regulatory compliance. The European Union has positioned itself at the forefront of trustworthy AI through the GDPR and the AI Act, establishing a robust regulatory framework that demands accountability, transparency, and protection of fundamental rights.
SAFEWORDS is a strategic component of the HumanAIze project, a coordinated initiative aimed at building next-generation human-centred, multilingual and trustworthy LLMs for Europe. SAFEWORDS provides the ethical, legal, and data governance infrastructure that ensures full alignment between technological innovation and European values.
Objectives
- Develop a comprehensive data governance framework for multilingual and multimodal AI training.
- Design and validate privacy-preserving anonymisation pipelines compliant with GDPR and AI Act requirements.
- Establish methodologies for bias detection and mitigation across datasets and models.
- Ensure the integration of legal, ethical, and sustainability principles throughout the AI lifecycle.
- Strengthen transparency, explainability, and accountability in LLM development.
Expected Impact
- Implementation of the AI Act and GDPR in advanced AI development
- Reduction of legal and ethical risks in generative AI
- Increased public trust in AI technologies
- Sustainable and energy-efficient model development
- Strategic European autonomy in language technologies
Approach
SAFEWORDS operates across the HumanAIze architecture:
In WP2, it codifies ethical, legal, and sustainability guidelines governing AI development.
In WP3, it leads privacy-preserving dataset curation, anonymisation, and fairness evaluation.
In collaboration with WP4 and WP5, it ensures that base models and aligned models comply with governance constraints and European regulatory standards.
In WP6, it contributes to evaluation and real-world validation, particularly in administrative-legal and biomedical domains.
The project integrates advanced anonymisation techniques, systems for detecting personally identifiable information (PII), bias auditing tools, and benchmark datasets with human annotations to evaluate factuality, safety, and non-discrimination.
Innovation
SAFEWORDS moves beyond compliance-by-design by operationalising regulation into technical protocols, measurable indicators, and enforceable governance mechanisms. It combines:
- Legal and AI expertise
- Privacy-preserving technologies
- Fairness auditing methodologies
- Sustainability-aware training strategies
This interdisciplinary integration ensures scalable and responsible AI infrastructures for multilingual European contexts.
European Added Value
By embedding governance, privacy and fairness at the core of AI innovation, SAFEWORDS ensures that HumanAIze delivers LLMs that are not only technically advanced, but also aligned with democratic values, fundamental rights, and the European digital strategy.
SAFEWORDS exemplifies Europe’s model of innovation: competitive, ethical, transparent and socially responsible.