| Category | Component | Owner | Closed-source or private | OSS license | Commercial use | Model size (B) | Release date | Code/Paper | Stars | Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multi-modal | ImageBind | Meta | | License | No | | | Github | 5.9k | ImageBind: One Embedding Space to Bind Them All |
| Image | DeepFloyd IF | stability.ai | | License, Model license | | | | Github | 6.4k | text-to-image model with a high degree of photorealism and language understanding |
| | Stable Diffusion Version 2 | stability.ai | | MIT, unknown | | | | Github | 23.5k | High-Resolution Image Synthesis with Latent Diffusion Models |
| | DALL-E | OpenAI | | Modified MIT | Yes | | | Github | 10.3k | PyTorch package for the discrete VAE used for DALL·E. |
| | DALL·E 2 | OpenAI | Yes | | | | | product | | |
| | DALLE2-pytorch | lucidrains | | MIT | Yes | | | Github | 9.7k | Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch |
| Speech | Whisper | OpenAI | | MIT | Yes | | | Github | 37.7k | Robust Speech Recognition via Large-Scale Weak Supervision |
| | MMS | Meta | Yes | | | | | paper | | |
| Code model | Codex | OpenAI | Yes | | | 12 | 2021/7/1 | blog, Paper | | |
| | AlphaCode | | | | | 41 | Feb 2022 | | | Competition-Level Code Generation with AlphaCode |
| | StarCoder | BigCode | No | Apache | | 15 | May 2023 | Github | 4.8k | language model (LM) trained on source code and natural language text |
| | CodeGen | Salesforce | No | ? | | | | Github | 3.6k | model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. |
| | Replit Code | replit | | | | 3 | May 2023 | | | replit-code-v1-3b model is a 2.7B LLM trained on 20 languages from the Stack Dedup v1.2 dataset. |
| | CodeGen2 | Salesforce | | BSD | Yes | 1, 3, 7, 16 | May 2023 | Github | | Code models for program synthesis. |
| | CodeT5 and CodeT5+ | Salesforce | | BSD | Yes | 16 | May 2023 | CodeT5 | | CodeT5 and CodeT5+ models for Code Understanding and Generation from Salesforce Research. |
| Language model | GPT | | | | | | June 2018 | GPT | | Improving Language Understanding by Generative Pre-Training |
| | BERT | | | | | | Oct 2018 | BERT | | Bidirectional Encoder Representations from Transformers |
| | RoBERTa | | | | | 0.125 - 0.355 | July 2019 | RoBERTa | | A Robustly Optimized BERT Pretraining Approach |
| | GPT-2 | | | | | 1.5 | Nov 2019 | GPT-2 | | Language Models are Unsupervised Multitask Learners |
| | T5 | | | | | 0.06 - 11 | Oct 2019 | Flan-T5 | | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
| | XLNet | | | | | | Jun 2019 | XLNet | | Generalized Autoregressive Pretraining for Language Understanding |
| | ALBERT | | | | | 0.235 | Sep 2019 | ALBERT | | A Lite BERT for Self-supervised Learning of Language Representations |
| | CTRL | | | | | 1.63 | Sep 2019 | CTRL | | CTRL: A Conditional Transformer Language Model for Controllable Generation |
| | GPT-3 | Azure | Yes | | | 175 | May 2020 | Paper | | Language Models are Few-Shot Learners |
| | GShard | | | | | 600 | Jun 2020 | Paper | | GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding |
| | BART | | | | | | Jul 2020 | BART | | Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension |
| | mT5 | | | | | 13 | Oct 2020 | mT5 | | mT5: A massively multilingual pre-trained text-to-text transformer |
| | PanGu-α | | | | | 13 | April 2021 | PanGu-α | | PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation |
| | CPM-2 | | | | | 198 | Jun 2021 | CPM | | CPM-2: Large-scale Cost-effective Pre-trained Language Models |
| | GPT-J 6B | EleutherAI | No | | Yes | 6 | June 2021 | GPT-J-6B | | A 6 billion parameter, autoregressive text generation model trained on The Pile. |
| | ERNIE 3.0 | Baidu | Yes | | | 10 | July 2021 | | | ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation |
| | Jurassic-1 | | | | | 178 | Aug 2021 | | | Jurassic-1: Technical Details and Evaluation |
| | ERNIE 3.0 Titan | | | | | 10 | July 2021 | | | ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation |
| | HyperCLOVA | | | | | 82 | Sep 2021 | | | What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers |
| | FLAN | | | | | 137 | 2021/10/1 | Paper | | Finetuned Language Models Are Zero-Shot Learners |
| | GPT-3.5 | Azure | Yes | | | | | | | |
| | GPT-4 | Azure | Yes | | | | 2023/3/1 | | | |
| | ERNIE 3.0 | Baidu | Yes | | | 10 | 2021/7/1 | Paper | | |
| | Jurassic-1 | | | | | 178 | 2021/8/1 | Paper | | |
| | T0 | | | | | 11 | Oct 2021 | T0 | | Multitask Prompted Training Enables Zero-Shot Task Generalization |
| | Yuan 1.0 | | | | | 245 | Oct 2021 | | | Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning |
| | WebGPT | | | | | 175 | Dec 2021 | | | WebGPT: Browser-assisted question-answering with human feedback |
| | Gopher | | | | | 280 | Dec 2021 | | | Scaling Language Models: Methods, Analysis & Insights from Training Gopher |
| | GLaM | | | | | 1200 | Dec 2021 | | | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts |
| | LaMDA | Bard | Yes | | | 137 | Jan 2022 | Paper | | LaMDA: Language Models for Dialog Applications |
| | MT-NLG | | | | | 530 | Jan 2022 | | | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model |
| | InstructGPT | | | | | 175 | Mar 2022 | | | Training language models to follow instructions with human feedback |
| | Chinchilla | | | | | 70 | Mar 2022 | | | Shows that for a given compute budget, the best performance is achieved not by the largest models but by smaller models trained on more data. |
| | GPT-NeoX-20B | | | | | 20 | April 2022 | GPT-NeoX-20B | | GPT-NeoX-20B: An Open-Source Autoregressive Language Model |
| | Tk-Instruct | | | | | 11 | April 2022 | Tk-Instruct-11B | | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks |
| | PaLM | Google | Yes | | | 540 | April 2022 | | | PaLM: Scaling Language Modeling with Pathways |
| | OPT | Meta | No | | Yes | 175 | May 2022 | OPT-13B, OPT-66B, Paper | | OPT: Open Pre-trained Transformer Language Models |
| | OPT-IML | | | | | 30, 175 | Dec 2022 | OPT-IML | | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization |
| | GLM-130B | | | | | 130 | Oct 2022 | GLM-130B | | GLM-130B: An Open Bilingual Pre-trained Model |
| | AlexaTM | | | | | 20 | Aug 2022 | | | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model |
| | Flan-T5 | | | | | 11 | Oct 2022 | Flan-T5-xxl | | Scaling Instruction-Finetuned Language Models |
| | Sparrow | | | | | 70 | Sep 2022 | | | Improving alignment of dialogue agents via targeted human judgements |
| | UL2 | | | | | 20 | Oct 2022 | UL2, Flan-UL2 | | UL2: Unifying Language Learning Paradigms |
| | U-PaLM | | | | | 540 | Oct 2022 | | | Transcending Scaling Laws with 0.1% Extra Compute |
| | BLOOM | BigScience | No | | Yes | 176 | Nov 2022 | BLOOM, Paper | | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model |
| | mT0 | | | | | 13 | Nov 2022 | mT0-xxl | | Crosslingual Generalization through Multitask Finetuning |
| | Galactica | | | | | 0.125 - 120 | Nov 2022 | Galactica | | Galactica: A Large Language Model for Science |
| | ChatGPT | | | | | | Nov 2022 | | | A model that interacts in a conversational way; the dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. |
| | LLaMA | Meta | No | | No | 7, 13, 33, 65 | 2023/2/1 | Paper, LLaMA | | LLaMA: Open and Efficient Foundation Language Models |
| | GPT-4 | | | | | | March 2023 | | | |
| | PanGu-Σ | | Yes | | | 1085 | 2023/3/1 | | | PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing |
| | BloombergGPT | | | | | 50 | March 2023 | | | BloombergGPT: A Large Language Model for Finance |
| | Cerebras-GPT | Cerebras | No | | Yes | 0.111 - 13 | 2023/3 | hf | | Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster |
| | oasst-sft-1-pythia-12b | LAION-AI | No | | Yes | 12 | 2023/3 | HF | | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. |
| | Pythia | EleutherAI | No | | Yes | 0.070 - 12 | 2023/3 | Pythia, Paper | | A suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. |
| | StableLM | | No | | No | 3, 7 | April 2023 | Github | | Stability AI's StableLM series of language models |
| | Dolly 2.0 | Databricks | No | | Yes | 3, 7, 12 | 2023/4 | Dolly | | An instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use. |
| | DLite | | | | | 0.124 - 1.5 | 2023/5 | HF | | Lightweight instruction-following models which exhibit ChatGPT-like interactivity. |
| | MPT-7B | MosaicML | No | Apache 2 | Yes | 7 | 2023/5/5 | blog | | A GPT-style model and the first in the MosaicML Foundation Series of models. |
| | h2oGPT | | | | | 12 | 2023/5 | HF | | h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document question-answering capabilities. |
| | LIMA | | | | | 65 | 2023/5 | | | A 65B-parameter LLaMA language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. |
| | RedPajama-INCITE | | | | | 3, 7 | 2023/5 | HF | | A family of models including base, instruction-tuned, and chat models. |
| | Gorilla | | | | | 7 | 2023/5 | Gorilla | | Gorilla: Large Language Model Connected with Massive APIs |
| | Med-PaLM 2 | | | | | | 2023/5 | | | Towards Expert-Level Medical Question Answering with Large Language Models |
| | PaLM 2 | | | | | | 2023/5 | | | A language model with better multilingual and reasoning capabilities that is more compute-efficient than its predecessor PaLM. |
| | Falcon LLM | | | | | 7, 40 | 2023/5 | 7B, 40B | | Foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens. |
| | Claude | Anthropic | Yes | | | | | | | |
| | GPT-Neo | EleutherAI | No | | Yes | | | | | |
| | GPT-NeoX | EleutherAI | No | | Yes | 20 | 2022/2/1 | Paper | | |
| | FastChat-T5-3B | LMSYS | No | Apache | Yes | | 2023/4 | | | |
| | OpenLLaMA | openlm-research | No | | Yes | | | | | |
| | OpenChatKit | Together | No | | Yes | | | | | |
| | YaLM | Yandex | No | | Yes | 100 | 2022/6/1 | Github | | |
| | ChatGLM-6B | Tsinghua | No | ChatGLM-6B | No | 6 | 2023/3/1 | Github | | |
| | Alpaca | Stanford | No | | No | | | | | |
| | Vicuna | | No | | No | 13 | 2023/3/1 | Blog | | |
| | StableVicuna | | No | | No | | | | | |
| | RWKV-4-Raven-7B | BlinkDL | No | | No | | | | | |
| | Alpaca-LoRA | tloen | No | | No | | | | | |
| | Koala | BAIR | No | | No | 13 | 2023/4/1 | Blog | | |
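
Many of the openly released checkpoints in the table (e.g., Pythia, GPT-J 6B, Dolly 2.0, MPT-7B, RedPajama-INCITE) are published on Hugging Face and can be tried with a few lines of code. Below is a minimal sketch using the transformers library; the model id `EleutherAI/pythia-70m` and the generation settings are illustrative assumptions, not values taken from the table.

```python
# Minimal sketch: load one of the open checkpoints listed above and sample a
# short completion. Assumes `transformers` and `torch` are installed; the
# model id is an example, any open causal-LM checkpoint works similarly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-70m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Closed models in the table (GPT-4, Claude, PaLM 2, and similar) are only reachable through their vendors' hosted APIs rather than downloadable weights.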
Last modified: Saturday, June 24, 2023 - 09:02