PVTHust
🏠 Working from home

Pinned

  1. Speech_project_Vin Public

    Multimodal speech emotion recognition using a ViT (AST) audio encoder and a Multiscale Attention Net (MANet) visual encoder

    Python · 4 stars · 2 forks

  2. project_NLP_final Public

    Group project for the Vin program: Modality Balance for Multimodal Conversational Emotion Recognition

    Python · 1 star · 1 fork

  3. HySonLab/LightMed Public

    Light-weight Medical Image Segmentation

    Python · 3 stars · 2 forks

  4. ichigo Public

    Forked from homebrewltd/ichigo

    Llama3.1 learns to Listen

    Python

  5. LLaMA-Omni Public

    Forked from ictnlp/LLaMA-Omni

    LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

    Python

  6. mini-omni Public

    Forked from gpt-omni/mini-omni

    Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

    Python