Natural Language Processing (NLP)
-
BioBERT
BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) is a domain-specific language representation model pre-trained on large-scale biomedical corpora. BioBERT significantly outperforms many previous text mining methods on three representative biomedical text mining tasks: biomedical named entity recognition, biomedical relation extraction, and biomedical question answering. A minimal usage sketch follows this entry.
Reference: PMID: 31501885
Contact: Lang Li
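For illustration, a minimal sketch of encoding a biomedical sentence with BioBERT via the Hugging Face transformers library; the checkpoint name dmis-lab/biobert-base-cased-v1.1 is an assumption about what is available, and a task-specific head (e.g., for named entity recognition) would normally be fine-tuned on top of these representations:

import torch
from transformers import AutoTokenizer, AutoModel

checkpoint = "dmis-lab/biobert-base-cased-v1.1"  # assumed BioBERT checkpoint name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

sentence = "Imatinib inhibits BCR-ABL tyrosine kinase activity."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per wordpiece token: shape (1, num_tokens, 768)
print(outputs.last_hidden_state.shape)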
-
BERT
BERT is a transformer-based model that uses bidirectional training of Transformer encoders and achieved state-of-the-art performance on numerous natural language processing tasks at the time of its introduction. Its development laid the groundwork for current generations of large language models, serving as a proof of concept for bidirectional transformer architectures. BERT builds on the attention mechanism but departs from the original encoder-decoder structure by using only the encoder. Its architecture consists of multiple layers of bidirectional self-attention, and it is pre-trained on a large text corpus comprising English Wikipedia and BookCorpus. BERT's key innovation is its ability to capture context from both directions, allowing a deeper understanding of language semantics; the masked-token sketch below illustrates this.
Reference: https://arxiv.org/abs/1810.04805
Contact: Lang Li
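As a minimal sketch of the bidirectional idea, assuming the transformers library and the standard bert-base-uncased checkpoint, BERT can fill in a masked token using context from both sides of the gap:

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "The patient was prescribed a low [MASK] of aspirin daily."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring vocabulary entry.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))  # the model's guess for the masked word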
-
PubBERT
PubBERT is a variant of BERT specifically fine-tuned on a corpus of biomedical literature, such as PubMed articles, to enhance its performance in biomedical text processing tasks. This adaptation allows PubBERT to better understand and generate language pertinent to the biomedical field, improving upon the general BERT model for domain-specific applications.
Contact: Lang Li
-
Llama 2
Llama 2 is a series of large language models by Meta AI, designed for diverse NLP tasks, available in three sizes: 7 billion, 13 billion and 70 billion parameters.
Reference: https://arxiv.org/abs/2307.09288
Contact: Ping Zhang
-
Mistral
Mistral is a 7-billion-parameter language model from Mistral AI designed for a broad range of natural language processing tasks; its sparse mixture-of-experts sibling, Mixtral 8x7B, is listed in a separate entry below.
Reference: https://arxiv.org/abs/2310.06825
Contact: Steve Rust
-
Mixtral
Mixtral (Mixtral 8x7B) is a sparse mixture-of-experts language model from Mistral AI: each layer contains eight expert feed-forward networks, and a learned router sends every token through only two of them, improving performance across diverse NLP tasks while keeping the per-token compute close to that of a much smaller dense model. A toy routing sketch follows this entry.
Reference: https://arxiv.org/abs/2401.04088
Contact: Steve Rust
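The routing idea can be sketched in a few lines of PyTorch; this toy layer is illustrative only (layer sizes are made up and the top-2 weighting is simplified), not Mixtral's actual implementation:

import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, hidden=16, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(hidden, num_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, 4 * hidden), nn.GELU(), nn.Linear(4 * hidden, hidden))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, hidden)
        weights, chosen = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):  # each token is processed by only top_k experts
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(5, 16)).shape)  # (5, 16): only 2 of 8 experts used per token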
-
Phi 2
Phi 2 is a 2.7-billion-parameter language model from Microsoft Research designed for natural language understanding and generation.
Reference: https://arxiv.org/abs/2306.11692
Contact: Steve Rust
-
Flan-T5
Flan-T5 is an instruction-fine-tuned language model from Google, optimized for a wide range of NLP tasks and available in Large (780M parameters) and XXL (11B parameters) sizes. A short prompting sketch follows this entry.
Reference: https://arxiv.org/abs/2210.11416
Contact: Steve Rust
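Because Flan-T5 is instruction-tuned, tasks are expressed as plain-text prompts; a minimal sketch, assuming the transformers library and the google/flan-t5-large checkpoint (about 780M parameters):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

prompt = "Answer the question: what does the acronym NLP stand for?"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))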
-
Falcon
Falcon is a family of decoder-only language models from the Technology Innovation Institute (TII), trained largely on the RefinedWeb dataset, designed for efficient inference, and available in 7-billion and 40-billion-parameter sizes.
Reference: https://arxiv.org/abs/2311.16867
Contact: Steve Rust
-
GatorTron
GatorTron is a clinical language model developed by the University of Florida in collaboration with NVIDIA, pre-trained on de-identified clinical notes and other biomedical text, and available in Small (345M), Medium (3.9B), and Large (8.9B) parameter sizes.
Reference: https://arxiv.org/abs/2203.03540
Contact: Steve Rust
-
RoBERTa
RoBERTa (Robustly Optimized BERT Approach) is a transformer-based model that improves on BERT by training longer with larger mini-batches on more data, removing the next-sentence prediction objective, and applying dynamic masking.
Reference: https://arxiv.org/abs/1907.11692
Contact: Steve Rust
-
Phi 3
Phi 3 is the latest iteration of Microsoft's Phi series of small language models, designed for advanced natural language understanding and generation.
Reference: https://arxiv.org/abs/2404.14219
Contact: Steve Rust
-
MPT
MPT (MosaicML Pretrained Transformer) is a family of decoder-only transformer models from MosaicML, available in 7-billion and 30-billion-parameter sizes and designed for a wide range of NLP tasks.
Reference: https://arxiv.org/abs/2307.11590
Contact: Steve Rust
-
OPT
OPT (Open Pre-trained Transformer) is a series of decoder-only transformer models from Meta AI designed for various NLP tasks, released in multiple sizes ranging from 125 million to 175 billion parameters.
Reference: https://arxiv.org/abs/2205.01068
Contact: Steve Rust