
A curated list of resources dedicated to NLP (papers, blogs, notes, etc.)


eagle705/awesome-nlp-note


awesome-nlp-note


A curated list of resources dedicated to Natural Language Processing and related topics (papers, blogs, and notes). Note: some materials, such as Python skills, are not directly related to NLP.

README Reference:

Contents

Blogs & YouTube

GitHub

Research Summaries and Trends

Environment

NLP in Korean


Libraries

  • KoNLPy - Python package for Korean natural language processing.
  • Mecab (Korean) - C++ library for Korean NLP
  • KoalaNLP - Scala library for Korean Natural Language Processing.

Datasets

Tutorials


Videos and Online Courses


Libraries


  • Python - Python NLP Libraries

    • TextBlob - Provides a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of Natural Language Toolkit (NLTK) and Pattern, and plays nicely with both 👍
    • spaCy - Industrial strength NLP with Python and Cython 👍
      • textacy - Higher level NLP built on spaCy
    • gensim - Python library to conduct unsupervised semantic modelling from plain text 👍
    • scattertext - Python library to produce d3 visualizations of how language differs between corpora
    • GluonNLP - A deep learning toolkit for NLP, built on MXNet/Gluon, for research prototyping and industrial deployment of state-of-the-art models on a wide range of NLP tasks.
    • AllenNLP - An NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.
    • PyTorch-NLP - NLP research toolkit designed to support rapid prototyping with better data loaders, word vector loaders, neural network layer representations, common NLP metrics such as BLEU
    • Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
    • PyNLPl - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for FoLiA, but also ARPA language models, Moses phrasetables, GIZA++ alignments.
    • jPTDP - A toolkit for joint part-of-speech (POS) tagging and dependency parsing. jPTDP provides pre-trained models for 40+ languages.
    • BigARTM - a fast library for topic modelling
    • Snips NLU - A production ready library for intent parsing
    • Chazutsu - A library for downloading and parsing standard NLP research datasets
    • Word Forms - Word forms can accurately generate all possible forms of an English word
    • Multilingual Latent Dirichlet Allocation (LDA) - A multilingual and extensible document clustering pipeline
    • NLP Architect - A library for exploring the state-of-the-art deep learning topologies and techniques for NLP and NLU
    • Flair - A very simple framework for state-of-the-art multilingual NLP built on PyTorch. Includes BERT, ELMo and Flair embeddings.
    • Kashgari - Simple, Keras-powered multilingual NLP framework, allows you to build your models in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS) and text classification tasks. Includes BERT and word2vec embedding.
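Several of the libraries above ship evaluation metrics; the PyTorch-NLP entry mentions BLEU. As a rough illustration of what such a metric computes, here is a minimal standard-library sketch of unsmoothed sentence-level BLEU against a single reference. The function names `ngrams` and `sentence_bleu` are hypothetical and are not the API of any library listed here; real implementations add smoothing and multi-reference support.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Hypothetical helper: unsmoothed BLEU for one candidate and one reference."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

For example, `sentence_bleu("the cat sat on the mat".split(), "the cat sat on the mat".split())` scores a perfect match, while a candidate sharing no words with the reference scores zero.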

Annotation Tools

  • Label Studio is an open-source, configurable data annotation tool. Its purpose is to enable you to label different types of data using the most convenient interface with a standardized output format.
  • brat - brat rapid annotation tool is an online environment for collaborative text annotation
  • LIDA: Lightweight Interactive Dialogue Annotator (in EMNLP 2019) - LIDA is an open-source dialogue annotation system that supports the full dialogue annotation pipeline, starting with dialogue / turn segmentation of raw text
  • GATE - General Architecture for Text Engineering is 15+ years old, free and open source
  • Anafora is a free and open-source, web-based raw text annotation tool
  • doccano - doccano is free, open-source, and provides annotation features for text classification, sequence labeling and sequence to sequence
  • tagtog, costs $
  • prodigy is an annotation tool powered by active learning, costs $
  • LightTag - Hosted and managed text annotation tool for teams, costs $
  • rstWeb - open source local or online tool for discourse tree annotations
  • GitDox - open source server annotation tool with GitHub version control and validation for XML data and collaborative spreadsheet grids
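Sequence-labeling annotations from tools like these are often stored as character spans, while many NER training pipelines expect per-token BIO tags. The sketch below shows one way to convert between the two, assuming a hypothetical export of `(start, end, label)` character offsets and simple whitespace tokenization. The actual export format differs from tool to tool, so treat `spans_to_bio` as an illustration rather than any tool's API.

```python
import re


def spans_to_bio(text, spans):
    """Convert (start, end, label) character spans to per-token BIO tags.

    Tokens are whitespace-delimited; a token is labeled only when it lies
    entirely inside a span (end offsets are exclusive).
    """
    tokens, tags = [], []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            if start <= match.start() and match.end() <= end:
                # The first token of a span gets B-, later tokens get I-.
                prefix = "B" if match.start() == start else "I"
                tag = f"{prefix}-{label}"
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags


tokens, tags = spans_to_bio(
    "Barack Obama visited Seoul",
    [(0, 12, "PER"), (21, 26, "LOC")],
)
print(list(zip(tokens, tags)))
```

Whitespace tokenization keeps the sketch short; with a subword or morphological tokenizer (as is common for Korean), the same offset comparison applies to that tokenizer's token offsets.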
