Category | Component | Owner | Closed-source or Proprietary | OSS License | Commercial Use | Model Size (B) | Release Date | Code/Paper | Stars | Description
Multi-Modal | ImageBind | Meta | License | No | Github | 5.9k | ImageBind: One Embedding Space To Bind Them All
Image | DeepFloyd IF | stability.ai | Github | 6.4k | text-to-image model with a high degree of photorealism and language understanding | |||||
Stable Diffusion Version 2 | stability.ai | MIT, unknown | Github | 23.5k | High-Resolution Image Synthesis with Latent Diffusion Models
DALL-E | OpenAI | Modified MIT | Yes | Github | 10.3k | PyTorch package for the discrete VAE used for DALL·E. | ||||
DALL·E 2 | OpenAI | Yes | product
DALLE2-pytorch | lucidrains | MIT | Yes | Github | 9.7k | Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch | ||||
Speech | Whisper | OpenAI | MIT | Yes | Github | 37.7k | Robust Speech Recognition via Large-Scale Weak Supervision | |||
MMS | Meta | Yes | paper | |||||||
Code model | Codex | OpenAI | Yes | 12 | 2021/7/1 | Paper | ||||
AlphaCode | 41 | Feb 2022 | Competition-Level Code Generation with AlphaCode | |||||||
StarCoder | BigCode | No | Apache | 15 | May 2023 | Github | 4.8k | Language model (LM) trained on source code and natural language text
CodeGen | Salesforce | No | ? | Github | 3.6k | model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. | ||||
Replit Code | replit | 3 | May 2023 | replit-code-v1-3b model is a 2.7B LLM trained on 20 languages from the Stack Dedup v1.2 dataset. | ||||||
CodeGen2 | Salesforce | BSD | Yes | 1, 3, 7, 16 | May 2023 | Github | Code models for program synthesis. | |||
CodeT5 and CodeT5+ | Salesforce | BSD | Yes | 16 | May 2023 | CodeT5 | CodeT5 and CodeT5+ models for Code Understanding and Generation from Salesforce Research. | |||
Language model |
GPT | June 2018 | GPT | Improving Language Understanding by Generative Pre-Training | ||||||
BERT | Oct 2018 | BERT | Bidirectional Encoder Representations from Transformers | |||||||
RoBERTa | 0.125 - 0.355 | July 2019 | RoBERTa | A Robustly Optimized BERT Pretraining Approach | ||||||
GPT-2 | 1.5 | Nov 2019 | GPT-2 | Language Models are Unsupervised Multitask Learners | ||||||
T5 | 0.06 - 11 | Oct 2019 | Flan-T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ||||||
XLNet | Jun 2019 | XLNet | Generalized Autoregressive Pretraining for Language Understanding
ALBERT | 0.235 | Sep 2019 | ALBERT | A Lite BERT for Self-supervised Learning of Language Representations | ||||||
CTRL | 1.63 | Sep 2019 | CTRL | CTRL: A Conditional Transformer Language Model for Controllable Generation | ||||||
GPT-3 | Azure | Yes | 175 | May 2020 | Paper | Language Models are Few-Shot Learners
GShard | 600 | Jun 2020 | Paper | GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding | ||||||
BART | Jul 2020 | BART | Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | |||||||
mT5 | 13 | Oct 2020 | mT5 | mT5: A massively multilingual pre-trained text-to-text transformer | ||||||
PanGu-α | 13 | April 2021 | PanGu-α | PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation | ||||||
CPM-2 | 198 | Jun 2021 | CPM | CPM-2: Large-scale Cost-effective Pre-trained Language Models | ||||||
GPT-J 6B | EleutherAI | No | Yes | 6 | June 2021 | GPT-J-6B | A 6 billion parameter, autoregressive text generation model trained on The Pile. | |||
ERNIE 3.0 | Baidu | Yes | 10 | July 2021 | ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation | |||||
Jurassic-1 | 178 | Aug 2021 | Jurassic-1: Technical Details and Evaluation | |||||||
ERNIE 3.0 Titan | 260 | Dec 2021 | ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
HyperCLOVA | 82 | Sep 2021 | What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers | |||||||
FLAN | 137 | 2021/10/1 | Paper | Finetuned Language Models Are Zero-Shot Learners | ||||||
GPT-3.5 | Azure | Yes |
GPT-4 | Azure | Yes | 2023/3/1
T0 | 11 | Oct 2021 | T0 | Multitask Prompted Training Enables Zero-Shot Task Generalization | ||||||
Yuan 1.0 | 245 | Oct 2021 | Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning | |||||||
WebGPT | 175 | Dec 2021 | WebGPT: Browser-assisted question-answering with human feedback | |||||||
Gopher | 280 | Dec 2021 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |||||||
GLaM | 1200 | Dec 2021 | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | |||||||
LaMDA | Bard | Yes | 137 | Jan 2022 | Paper | LaMDA: Language Models for Dialog Applications | ||||
MT-NLG | 530 | Jan 2022 | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | |||||||
InstructGPT | 175 | Mar 2022 | Training language models to follow instructions with human feedback | |||||||
Chinchilla | 70 | Mar 2022 | Shows that, for a fixed compute budget, the best performance is achieved not by the largest models but by smaller models trained on more data.
GPT-NeoX-20B | 20 | April 2022 | GPT-NeoX-20B | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | ||||||
Tk-Instruct | 11 | April 2022 | Tk-Instruct-11B | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks | ||||||
PaLM | Yes | 540 | April 2022 | PaLM: Scaling Language Modeling with Pathways
OPT | Meta | No | Yes | 175 | May 2022 | OPT-13B, OPT-66B, Paper | OPT: Open Pre-trained Transformer Language Models
OPT-IML | 30, 175 | Dec 2022 | OPT-IML | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization | ||||||
GLM-130B | 130 | Oct 2022 | GLM-130B | GLM-130B: An Open Bilingual Pre-trained Model | ||||||
AlexaTM | 20 | Aug 2022 | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | |||||||
Flan-T5 | 11 | Oct 2022 | Flan-T5-xxl | Scaling Instruction-Finetuned Language Models | ||||||
Sparrow | 70 | Sep 2022 | Improving alignment of dialogue agents via targeted human judgements | |||||||
UL2 | 20 | Oct 2022 | UL2, Flan-UL2 | UL2: Unifying Language Learning Paradigms | ||||||
U-PaLM | 540 | Oct 2022 | Transcending Scaling Laws with 0.1% Extra Compute | |||||||
BLOOM | BigScience | No | Yes | 176 | Nov 2022 | BLOOM, Paper | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
mT0 | 13 | Nov 2022 | mT0-xxl | Crosslingual Generalization through Multitask Finetuning | ||||||
Galactica | 0.125 - 120 | Nov 2022 | Galactica | Galactica: A Large Language Model for Science | ||||||
ChatGPT | Nov 2022 | A model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests.
LLaMA | Meta | No | No | 7, 13, 33, 65 | 2023/2/1 | Paper, LLaMA | LLaMA: Open and Efficient Foundation Language Models
PanGu-Σ | Yes | 1085 | 2023/3/1 | PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
BloombergGPT | 50 | March 2023 | BloombergGPT: A Large Language Model for Finance | |||||||
Cerebras-GPT | Cerebras | No | Yes | 0.111 - 13 | 2023/3 | hf | Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster | |||
oasst-sft-1-pythia-12b | LAION-AI | No | Yes | 12 | 2023/3 | HF | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. | |||
Pythia | EleutherAI | No | Yes | 0.070 - 12 | 2023/3 | A suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters.
StableLM | Stability AI | No | No | 3, 7 | April 2023 | Github | Stability AI's StableLM series of language models
Dolly 2.0 | Databricks | No | Yes | 3, 7, 12 | 2023/4 | Dolly | An instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use.
DLite | 0.124 - 1.5 | 2023/5 | HF | Lightweight instruction following models which exhibit ChatGPT-like interactivity. | ||||||
MPT-7B | MosaicML | No | Apache 2 | Yes | 7 | 2023/5/5 | blog | a GPT-style model, and the first in the MosaicML Foundation Series of models. | ||
h2oGPT | 12 | 2023/5 | HF | h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. | ||||||
LIMA | 65 | 2023/5 | A 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. | |||||||
RedPajama-INCITE | 3, 7 | 2023/5 | HF | A family of models including base, instruction-tuned & chat models. | ||||||
Gorilla | 7 | 2023/5 | Gorilla | Gorilla: Large Language Model Connected with Massive APIs | ||||||
Med-PaLM 2 | 2023/5 | Towards Expert-Level Medical Question Answering with Large Language Models | ||||||||
PaLM 2 | 2023/5 | A Language Model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. | ||||||||
Falcon LLM | 7, 40 | 2023/5 | 7B, 40B | foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens. | ||||||
Claude | Anthropic | Yes | ||||||||
GPT-Neo | EleutherAI | No | Yes |
GPT-NeoX | EleutherAI | No | Yes | 20 | 2022/2/1 | Paper
FastChat-T5-3B | LMSYS | No | Apache | Yes | 2023/4
OpenLLaMA | openlm-research | No | Yes |
OpenChatKit | Together | No | Yes | |||||||
YaLM | Yandex | No | Yes | 100 | 2022/6/1 | Github
ChatGLM-6B | Tsinghua | No | ChatGLM-6B | No | 6 | 2023/3/1 | Github
Alpaca | Stanford | No | No | |||||||
Vicuna | No | No | 13 | 2023/3/1 | Blog | |||||
StableVicuna | No | No | ||||||||
RWKV-4-Raven-7B | BlinkDL | No | No |
Alpaca-LoRA | tloen | No | No |
Koala | BAIR | No | No | 13 | 2023/4/1 | Blog
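Many of the open-weight models listed in the table above (Pythia, GPT-NeoX, MPT-7B, Falcon, RedPajama-INCITE, and others) are published on the Hugging Face Hub and can be loaded with the transformers library. A minimal sketch, assuming the EleutherAI/pythia-70m checkpoint as a small, CPU-friendly example (substitute any open model from the table; some, such as MPT-7B, additionally require trust_remote_code=True):

```python
# Minimal sketch: load one of the open models from the table via Hugging Face transformers.
# The checkpoint id below is an assumption; swap in any open checkpoint from the table.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-70m"  # smallest Pythia checkpoint, runs on CPU

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Greedy generation of a short continuation.
inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The closed entries in the table (GPT-4, Claude, PaLM 2, and similar) are reachable only through their vendors' APIs or products rather than downloadable weights.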
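For the Speech row above (OpenAI's Whisper, MIT-licensed), a minimal transcription sketch using the openai-whisper package; the model size ("base") and the audio file path are assumptions for illustration:

```python
# Minimal sketch: speech-to-text with OpenAI's open-source Whisper model.
# Requires `pip install openai-whisper` and ffmpeg on the PATH; "audio.mp3" is a placeholder path.
import whisper

model = whisper.load_model("base")      # other sizes: tiny, small, medium, large
result = model.transcribe("audio.mp3")  # language is auto-detected by default
print(result["text"])
```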