thu-coai

94 repositories

Crisp
Public
Crisp: Cognitive Restructuring of Negative Thoughts through Multi-turn Supportive Dialogues
Python
•0•5•0•0•Updated Apr 27, 2025
CharacterBench
Public
[AAAI'25] CharacterBench: Benchmarking Character Customization of Large Language Models
Python
•0•6•0•0•Updated Apr 25, 2025
AISafetyLab
Public
AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
Python
•
MIT License
•7•158•1•0•Updated Apr 5, 2025
VPO
Public
Python
•
Apache License 2.0
•1•7•0•0•Updated Mar 26, 2025
MAPS
Public
Official implementation of the ICLR 2025 paper "MAPS: Advancing Multi-modal Reasoning in Expert-level Physical Science"
Python
•1•2•0•0•Updated Mar 12, 2025
TransferAttack
Public
Python
•0•7•0•0•Updated Mar 5, 2025
LongSafety
Public
Python
•
MIT License
•0•11•0•0•Updated Feb 26, 2025
Agent-SafetyBench
Public
Python
•1•28•0•0•Updated Feb 20, 2025
ComplexBench
Public
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
Python
•
MIT License
•9•81•4•0•Updated Feb 20, 2025
CharacterGLM-6B
Public
[EMNLP'24] CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
Python
•
Apache License 2.0
•36•462•2•0•Updated Jan 7, 2025
SPaR
Public
Python
•
Apache License 2.0
•3•47•1•0•Updated Dec 17, 2024
MiniPLM
Public
[ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models
Python
•
MIT License
•8•42•4•0•Updated Nov 23, 2024
MoralStory
Public
Python
•0•17•1•0•Updated Nov 7, 2024
OpenMEVA
Public
Benchmark for evaluating open-ended generation
benchmark evaluation-metrics language-generation
Python
•7•48•3•1•Updated Nov 6, 2024
CodePlan
Public
1•16•1•0•Updated Oct 16, 2024
ShieldLM
Public
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]
Python
•
MIT License
•8•188•1•0•Updated Sep 29, 2024
PICL
Public
Code for the ACL 2023 paper: Pre-Training to Learn in Context
Python
•
MIT License
•4•108•1•1•Updated Jul 26, 2024
PsyQA
Public
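A Chinese mental health support question-answering dataset with rich annotations of support strategies. It can be used to generate long counseling texts that are rich in support strategies.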
17•205•0•0•Updated Jul 21, 2024
JailbreakDefense_GoalPriority
Public
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Python
•1•22•0•0•Updated Jul 9, 2024
SafeUnlearning
Public
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks
Python
•1•25•3•0•Updated Jul 9, 2024
CritiqueLLM
Public
Python
•3•143•6•0•Updated Jul 1, 2024
AutoDetect
Public
Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.
Python
•
MIT License
•1•42•0•0•Updated Jun 25, 2024
BPO
Public
Python
•
Apache License 2.0
•15•318•1•0•Updated Jun 24, 2024
SafetyBench
Public
Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]
Python
•
MIT License
•9•218•4•1•Updated Jun 24, 2024
Emotional-Support-Conversation
Public
Data and code for the ACL 2021 paper: Towards Emotional Support Dialog Systems
Python
•
Other
•36•259•3•0•Updated Jun 19, 2024
CrossWOZ
Public
A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Python
•
Apache License 2.0
•118•684•3•1•Updated Jun 17, 2024
ConvLab-2
Public
ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
dialogue-systems task-oriented-dialogue dialogue
Python
•
Apache License 2.0
•134•462•13•1•Updated Jun 17, 2024
Safety-Prompts
Public
Chinese safety prompts for evaluating and improving the safety of LLMs.
prompt safety instruction attack-defense chinese-language llm prompt-engineering chatgpt
Apache License 2.0
•84•1k•1•0•Updated Feb 27, 2024
Implicit-Toxicity
Public
Official Code for EMNLP 2023 paper: "Unveiling the Implicit Toxicity in Large Language Models"
Python
•0•12•0•0•Updated Nov 30, 2023
Re3Dial
Public
Official Code for EMNLP 2023 paper: "Re3Dial: Retrieve, Reorganize and Rescale Conversations for Long-Turn Open-Domain Dialogue Pre-training"
Python
•0•6•1•0•Updated Oct 22, 2023