llama-guard

Llama Guard is an LLM-based input-output safeguard model geared towards human-AI conversation use cases. If the input is determined to be safe, the response will be Safe; otherwise, the response will be Unsafe, followed by one or more of the violated categories (see the parsing sketch after the list):

  • S1: Violent Crimes.
  • S2: Non-Violent Crimes.
  • S3: Sex Crimes.
  • S4: Child Sexual Exploitation.
  • S5: Defamation.
  • S6: Specialized Advice.
  • S7: Privacy.
  • S8: Intellectual Property.
  • S9: Indiscriminate Weapons.
  • S10: Hate.
  • S11: Suicide & Self-Harm.
  • S12: Sexual Content.
  • S13: Elections.
  • S14: Code Interpreter Abuse.
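
Since the verdict and any category codes arrive as plain text, a small helper can split them apart. This is a minimal sketch assuming the model replies with "safe" or "unsafe" on the first line and a comma-separated list of codes (e.g. S1,S10) on the next; the exact format may vary across Llama Guard versions.

```python
def parse_llama_guard(response: str) -> tuple[bool, list[str]]:
    """Return (is_safe, violated_categories) from a raw Llama Guard reply."""
    lines = response.strip().splitlines()
    verdict = lines[0].strip().lower()
    if verdict == "safe":
        return True, []
    # Unsafe replies list the violated category codes on the following line.
    categories = lines[1].split(",") if len(lines) > 1 else []
    return False, [c.strip() for c in categories]


print(parse_llama_guard("unsafe\nS2"))  # -> (False, ['S2'])
print(parse_llama_guard("safe"))        # -> (True, [])
```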

This repository contains a Streamlit app for exploring content moderation with Llama Guard on Groq. Sign up for an account at GroqCloud and get an API token, which you'll need for this project.
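
As an illustration, a minimal call to Llama Guard through the Groq Python SDK might look like the sketch below. The model identifier is an assumption; check GroqCloud for the Llama Guard model currently offered.

```python
import os

from groq import Groq  # pip install groq

# Assumes GROQ_API_KEY is set in the environment.
client = Groq(api_key=os.environ["GROQ_API_KEY"])

completion = client.chat.completions.create(
    model="llama-guard-3-8b",  # assumed model ID; verify on GroqCloud
    messages=[{"role": "user", "content": "Check this message for safety."}],
)

# The reply is the moderation verdict, e.g. "safe" or "unsafe\nS2".
print(completion.choices[0].message.content)
```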

Here's a sample response from Llama Guard after detecting a prompt that violates a specific category.

[Screenshot: sample Llama Guard response flagging a violating prompt]
