| Category | Component | Owner | Closed / Proprietary | OSS License | Commercial Use | Model Size (B) | Release Date | Code / Paper | Stars | Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multi-Modal | ImageBind | Meta | License | No | Github | 5.9k | ImageBind: One Embedding Space to Bind Them All ||||
| Image | DeepFloyd IF | stability.ai | Github | 6.4k | text-to-image model with a high degree of photorealism and language understanding | |||||
| Stable Diffusion Version 2 | stability.ai | MIT, unknown | Github | 23.5k | High-Resolution Image Synthesis with Latent Diffusion Models |||||
| DALL-E | OpenAI | Modified MIT | Yes | Github | 10.3k | PyTorch package for the discrete VAE used for DALL·E. | ||||
| DALL·E 2 | OpenAI | Yes | product |||||||
| DALLE2-pytorch | lucidrains | MIT | Yes | Github | 9.7k | Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch | ||||
| Speech | Whisper | OpenAI | MIT | Yes | Github | 37.7k | Robust Speech Recognition via Large-Scale Weak Supervision | |||
| MMS | Meta | Yes | paper | |||||||
| Code model | Codex | OpenAI | Yes | 12 | 2021/7/1 | Paper | ||||
| AlphaCode | 41 | Feb 2022 | Competition-Level Code Generation with AlphaCode | |||||||
| starcoder | BigCode | No | Apache | 15 | May 2023 | Github | 4.8k | language model (LM) trained on source code and natural language text | ||
| CodeGen | Salesforce | No | ? | Github | 3.6k | model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. | ||||
| Replit Code | replit | 3 | May 2023 | replit-code-v1-3b model is a 2.7B LLM trained on 20 languages from the Stack Dedup v1.2 dataset. | ||||||
| CodeGen2 | Salesforce | BSD | Yes | 1, 3, 7, 16 | May 2023 | Github | Code models for program synthesis. | |||
| CodeT5 and CodeT5+ | Salesforce | BSD | Yes | 16 | May 2023 | CodeT5 | CodeT5 and CodeT5+ models for Code Understanding and Generation from Salesforce Research. | |||
| Language model | GPT | June 2018 | GPT | Improving Language Understanding by Generative Pre-Training ||||||
| BERT | Oct 2018 | BERT | Bidirectional Encoder Representations from Transformers | |||||||
| RoBERTa | 0.125 - 0.355 | July 2019 | RoBERTa | A Robustly Optimized BERT Pretraining Approach | ||||||
| GPT-2 | 1.5 | Nov 2019 | GPT-2 | Language Models are Unsupervised Multitask Learners | ||||||
| T5 | 0.06 - 11 | Oct 2019 | Flan-T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | ||||||
| XLNet | Jun 2019 | XLNet | Generalized Autoregressive Pretraining for Language Understanding |||||||
| ALBERT | 0.235 | Sep 2019 | ALBERT | A Lite BERT for Self-supervised Learning of Language Representations | ||||||
| CTRL | 1.63 | Sep 2019 | CTRL | CTRL: A Conditional Transformer Language Model for Controllable Generation | ||||||
| GPT 3 | Azure | Yes | 175 | May 2020 | Paper | Language Models are Few-Shot Learners | ||||
| GShard | 600 | Jun 2020 | Paper | GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding | ||||||
| BART | Jul 2020 | BART | Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | |||||||
| mT5 | 13 | Oct 2020 | mT5 | mT5: A massively multilingual pre-trained text-to-text transformer | ||||||
| PanGu-α | 13 | April 2021 | PanGu-α | PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation | ||||||
| CPM-2 | 198 | Jun 2021 | CPM | CPM-2: Large-scale Cost-effective Pre-trained Language Models | ||||||
| GPT-J 6B | EleutherAI | No | Yes | 6 | June 2021 | GPT-J-6B | A 6 billion parameter, autoregressive text generation model trained on The Pile. | |||
| ERNIE 3.0 | Baidu | Yes | 10 | July 2021 | ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation | |||||
| Jurassic-1 | 178 | Aug 2021 | Jurassic-1: Technical Details and Evaluation | |||||||
| ERNIE 3.0 Titan | 10 | July 2021 | ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation | |||||||
| HyperCLOVA | 82 | Sep 2021 | What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers | |||||||
| FLAN | 137 | 2021/10/1 | Paper | Finetuned Language Models Are Zero-Shot Learners | ||||||
| GPT 3.5 | Azure | Yes | ||||||||
| GPT 4 | Azure | Yes | 2023/3/1 | |||||||
| T0 | 11 | Oct 2021 | T0 | Multitask Prompted Training Enables Zero-Shot Task Generalization | ||||||
| Yuan 1.0 | 245 | Oct 2021 | Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning | |||||||
| WebGPT | 175 | Dec 2021 | WebGPT: Browser-assisted question-answering with human feedback | |||||||
| Gopher | 280 | Dec 2021 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |||||||
| GLaM | 1200 | Dec 2021 | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | |||||||
| LaMDA | Bard | Yes | 137 | Jan 2022 | Paper | LaMDA: Language Models for Dialog Applications | ||||
| MT-NLG | 530 | Jan 2022 | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | |||||||
| InstructGPT | 175 | Mar 2022 | Training language models to follow instructions with human feedback | |||||||
| Chinchilla | 70 | Mar 2022 | Shows that for a compute budget, the best performances are not achieved by the largest models but by smaller models trained on more data. | |||||||
| GPT-NeoX-20B | 20 | April 2022 | GPT-NeoX-20B | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | ||||||
| Tk-Instruct | 11 | April 2022 | Tk-Instruct-11B | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks | ||||||
| PaLM | Yes | 540 | April 2022 | PaLM: Scaling Language Modeling with Pathways ||||||
| OPT | Meta | No | Yes | 175 | May 2022 | OPT-13B, OPT-66B, Paper | OPT: Open Pre-trained Transformer Language Models |||
| OPT-IML | 30, 175 | Dec 2022 | OPT-IML | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization | ||||||
| GLM-130B | 130 | Oct 2022 | GLM-130B | GLM-130B: An Open Bilingual Pre-trained Model | ||||||
| AlexaTM | 20 | Aug 2022 | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | |||||||
| Flan-T5 | 11 | Oct 2022 | Flan-T5-xxl | Scaling Instruction-Finetuned Language Models | ||||||
| Sparrow | 70 | Sep 2022 | Improving alignment of dialogue agents via targeted human judgements | |||||||
| UL2 | 20 | Oct 2022 | UL2, Flan-UL2 | UL2: Unifying Language Learning Paradigms | ||||||
| U-PaLM | 540 | Oct 2022 | Transcending Scaling Laws with 0.1% Extra Compute | |||||||
| BLOOM | BigScience | No | Yes | 176 | Nov 2022 | BLOOM, Paper | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model |||
| mT0 | 13 | Nov 2022 | mT0-xxl | Crosslingual Generalization through Multitask Finetuning | ||||||
| Galactica | 0.125 - 120 | Nov 2022 | Galactica | Galactica: A Large Language Model for Science | ||||||
| ChatGPT | Nov 2022 | A model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. | ||||||||
| LLaMA | Meta | No | No | 7, 13, 33, 65 | 2023/2/1 | Paper, LLaMA | LLaMA: Open and Efficient Foundation Language Models |||
| GPT-4 | March 2023 | |||||||||
| PanGu-Σ | Yes | 1085 | 2023/3/1 | PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing ||||||
| BloombergGPT | 50 | March 2023 | BloombergGPT: A Large Language Model for Finance | |||||||
| Cerebras-GPT | Cerebras | No | Yes | 0.111 - 13 | 2023/3 | hf | Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster | |||
| oasst-sft-1-pythia-12b | LAION-AI | No | Yes | 12 | 2023/3 | HF | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. | |||
| Pythia | EleutherAI | No | Yes | 0.070 - 12 | 2023/3 | A suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. ||||
| StableLM | No | No | 3, 7 | April 2023 | Github | Stability AI's StableLM series of language models | ||||
| Dolly 2.0 | DataBricks | No | Yes | 3, 7, 12 | 2023/4 | Dolly | An instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use. | |||
| DLite | 0.124 - 1.5 | 2023/5 | HF | Lightweight instruction following models which exhibit ChatGPT-like interactivity. | ||||||
| MPT-7B | MosaicML | No | Apache 2 | Yes | 7 | 2023/5/5 | blog | a GPT-style model, and the first in the MosaicML Foundation Series of models. | ||
| h2oGPT | 12 | 2023/5 | HF | h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. | ||||||
| LIMA | 65 | 2023/5 | A 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. | |||||||
| RedPajama-INCITE | 3, 7 | 2023/5 | HF | A family of models including base, instruction-tuned & chat models. | ||||||
| Gorilla | 7 | 2023/5 | Gorilla | Gorilla: Large Language Model Connected with Massive APIs | ||||||
| Med-PaLM 2 | 2023/5 | Towards Expert-Level Medical Question Answering with Large Language Models | ||||||||
| PaLM 2 | 2023/5 | A Language Model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. | ||||||||
| Falcon LLM | 7, 40 | 2023/5 | 7B, 40B | foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens. | ||||||
| Claude | Anthropic | Yes | ||||||||
| GPT-Neo | EleutherAI | No | Yes |||||||
| GPT-NeoX | EleutherAI | No | Yes | 20 | 2022/2/1 | Paper ||||
| FastChat-T5-3B | LMSYS | No | Apache | Yes | 2023/4 |||||
| OpenLLaMA | openlm-research | No | Yes |||||||
| OpenChatKit | Together | No | Yes | |||||||
| YaLM | Yandex | No | Yes | 100 | 2022/6/1 | Github ||||
| ChatGLM-6B | Tsinghua | No | ChatGLM-6B | No | 6 | 2023/3/1 | Github |||
| Alpaca | Stanford | No | No | |||||||
| Vicuna | No | No | 13 | 2023/3/1 | Blog | |||||
| StableVicuna | No | No | ||||||||
| RWKV-4-Raven-7B | BlinkDL | No | No |||||||
| Alpaca-LoRA | tloen | No | No |||||||
| Koala | BAIR | No | No | 13 | 2023/4/1 | Blog |
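
Many of the open models listed above (e.g., GPT-NeoX, Pythia, Falcon, MPT, BLOOM, OpenLLaMA) are published on the Hugging Face Hub and can be loaded with the same few lines of `transformers` code. The sketch below is a minimal example under that assumption; the checkpoint `EleutherAI/pythia-160m` is used here only as a small stand-in and can be swapped for another open model id from the table.

```python
# Minimal sketch (not any vendor's official example): load an open model
# from the Hugging Face Hub and generate a short completion.
# Assumes `pip install transformers torch`; the model id below is a small
# stand-in from the Pythia suite listed in the table.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-160m"  # replace with another open checkpoint to compare models

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that closed or hosted models in the table (GPT-3.5/4, Claude, PaLM 2, etc.) are reachable only through their vendors' APIs, not through local weights.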