
Unlocking Mixture-of-Experts (MoE) LLMs: Your MoE Model Is a Free Embedding Model


I recently came across an interesting paper titled "Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free." The recent trend in LLM architectures is decoder-only models, which, because of their attention scheme, are not an obvious fit for embedding extraction. However, the authors show that a Mixture-of-Experts (MoE) LLM can serve as an embedding model for a variety of embedding-focused tasks without any further fine-tuning. In this article, we first review MoE, and then I will explain how it works as an embedding model and walk through a practical implementation.

Table of Contents

  1. What is Mixture-of-Experts (MoE)?
  2. How does MoE work as an embedding model?
  3. Practical implementation: MoEE with BERTopic

1. What is Mixture-of-Experts (MoE)?

Mixture-of-Experts (MoE) is an architecture with multiple sub-networks, called "experts," each of which specializes in different tasks or aspects of the data. One advantage of MoE is that it lets a model be pretrained with much less compute than a dense model of the same or larger size, while maintaining or even improving quality. Therefore, on a fixed budget, we can train a better model with MoE than with a dense model of comparable size. As a recent success, Mixtral 8x7B outperforms LLaMA 2 70B on many evaluation datasets.

Now, let's look at the MoE architecture. Recent successful MoE models build on the Transformer, so I will focus on the popular Transformer-based MoE architecture. An MoE has two main components, described below.

  • MoE layers

In the Transformer architecture, MoE replaces the feed-forward network (FFN) layers with MoE layers. Each MoE layer contains several experts (e.g., four experts in the figure above), and each expert is itself a simple FFN. Note that the other components of the Transformer, such as the self-attention layers, share the same weights. For this reason, the parameter count of an MoE is not a simple multiple: Mixtral 8x7B, for example, has 47B parameters rather than 8 x 7 = 56B, because all layers other than the MoE layers are shared.

  • Gating network

The gating network, or router, is the key component of an MoE. It receives input tokens and selects the most relevant experts for each token. For example, in the figure above, the left side of the router sends the token "more" to the second expert, while the router assigns the token "Parameters" to the first expert. In general, the gating network selects the top-k experts relevant to a given token and dispatches the token to them; Mixtral 8x7B, for instance, selects the top 2 experts.

How do we select the top-k experts? We compute each expert's relevance probability with a softmax function and keep the k experts with the highest probabilities, as shown below. (I have extracted the gating portion of the figure above.)

The gating network has its own weights. We apply the softmax function to the dot product between the input token and the gating network's weights, which yields the probability that each expert is relevant to the given token. Based on these probabilities, we pick the top-k relevant experts. An MoE with this kind of gating network is called a sparse MoE. A concrete sketch of this computation follows.
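As a minimal sketch in PyTorch (dimensions and weights here are hypothetical, not any specific model's implementation), top-k gating looks like this:

import torch
import torch.nn.functional as F

# Minimal sketch of sparse top-k gating; dimensions are hypothetical.
hidden_dim, num_experts, top_k = 2048, 8, 2

token = torch.randn(hidden_dim)                 # one input token representation
w_gate = torch.randn(hidden_dim, num_experts)   # gating network weights

logits = token @ w_gate                         # dot product with the gating weights
probs = F.softmax(logits, dim=-1)               # probability each expert is relevant
topk_probs, topk_idx = probs.topk(top_k)        # keep the k most relevant experts

# The token is routed only to the experts in `topk_idx`, and their FFN
# outputs are combined, weighted by the (renormalized) `topk_probs`.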

These are the basics needed to understand how an MoE works as an embedding model. Now, let's dive into how that actually works.

2. How does MoE work as an embedding model?

A quick recap of embeddings

Before diving into the main topic of this section, let's quickly review embeddings. An embedding is the internal representation of the input data inside a deep learning model; it carries semantic, condensed information about the data. We typically extract the last hidden state of a neural network as the embedding, as shown below.

We usually use encoder-based models to extract embeddings, because their bidirectional attention captures semantics better than decoder-only models. Decoder-only models typically use causal attention, which attends only to preceding tokens; as a result, they cannot capture rich semantics, such as contextual information, as well as encoder(-decoder) models can.
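As a quick reference, here is a minimal sketch of extracting a mean-pooled last-hidden-state embedding with Hugging Face transformers (the model name is just an example):

import torch
from transformers import AutoModel, AutoTokenizer

# Minimal sketch: mean-pool the last hidden state into a sentence embedding.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # example model
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("MoE LLMs can act as embedding models.", return_tensors="pt")
with torch.no_grad():
    last_hidden = model(**inputs).last_hidden_state      # (1, seq_len, hidden_dim)

mask = inputs["attention_mask"].unsqueeze(-1)            # mask out padding positions
embedding = (last_hidden * mask).sum(1) / mask.sum(1)    # (1, hidden_dim)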

How does MoE work as an embedding model?

It was previously widely believed that decoder-only models were ill-suited to embedding extraction. However, the authors found that the routing weights in an MoE provide information complementary to the decoder embedding. The routing weights in each layer reflect the router's reasoning choices for the input tokens, so they capture semantic information about the input that the hidden-state embedding may lose. Mathematically, we can describe this as:

RW = concat( g(H^(1)), g(H^(2)), ..., g(H^(L)) )

where g is the softmax function and H^(l) is the hidden state entering the router at layer l. We concatenate the routing weights across all L MoE layers so that none of the model's reasoning choices are lost.
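A minimal sketch of this step, assuming we already have one router-logit tensor per MoE layer (the mean pooling over tokens is one simple, illustrative choice):

import torch
import torch.nn.functional as F

def routing_weight_embedding(router_logits_per_layer):
    """Concatenate softmaxed router logits across all MoE layers.

    `router_logits_per_layer`: one tensor per layer, each of shape
    (seq_len, num_experts). Illustrative sketch only.
    """
    per_layer = [F.softmax(logits, dim=-1).mean(dim=0)   # mean-pool over tokens
                 for logits in router_logits_per_layer]
    return torch.cat(per_layer)                          # (num_layers * num_experts,)

# Example with made-up sizes: 16 layers, 10 tokens, 64 experts
rw = routing_weight_embedding([torch.randn(10, 64) for _ in range(16)])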

To take full advantage of both the routing weights and the decoder embedding, the authors propose a method called MoE Embedding (MoEE) that forms a more comprehensive representation. MoEE comes in two variants. The first is a concatenation-based combination, described below.

This approach is straightforward: we simply concatenate the routing weights and the decoder embedding. The authors call it MoEE (concat). It preserves the distinct information captured by each component while letting downstream tasks exploit the combined representation. A sketch follows.
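A sketch of MoEE (concat), assuming rw is the routing-weight embedding and hs is the decoder's hidden-state embedding (sizes are placeholders):

import numpy as np

rw = np.random.rand(1024)                # placeholder routing-weight embedding
hs = np.random.rand(2048)                # placeholder hidden-state embedding
moee_concat = np.concatenate([rw, hs])   # MoEE (concat) representation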

The other variant is a weighted-sum integration. It takes a weighted sum of the similarity scores computed from the routing weights and from the hidden-state (HS) embedding, denoted MoEE (sum). This variant is used for tasks that compare two sentences, such as semantic textual similarity.

Here, α is a hyperparameter that controls the contribution of the routing weights. After computing the similarity score for each pair, we measure the rank correlation, e.g., Spearman's rank correlation, between the computed scores and the ground-truth similarities. A sketch is shown below.
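A sketch of scoring one sentence pair with MoEE (sum); α and all vectors here are placeholders:

import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

alpha = 0.5  # weight of the routing-weight similarity (placeholder value)
rw1, rw2 = np.random.rand(1024), np.random.rand(1024)  # routing-weight embeddings
hs1, hs2 = np.random.rand(2048), np.random.rand(2048)  # hidden-state embeddings

score = cos(hs1, hs2) + alpha * cos(rw1, rw2)          # MoEE (sum) similarity

# Over a whole dataset, these scores are then compared with the gold
# similarities via a rank correlation such as scipy.stats.spearmanr.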

For practical use, I find MoEE (concat) the easiest to work with. In addition, the authors leverage the PromptEOL technique [4] to enhance MoEE. This technique uses a prompt template that constrains the LLM to compress the semantics of the input into its next-token prediction; the template from the PromptEOL paper is: This sentence : "[text]" means in one word:"

The paper reports a performance comparison across MTEB tasks.

MoEE with PromptEOL outperforms the supervised and self-supervised baselines. Note that the leaderboard used is not the latest, so this result is not SOTA. The value of the method is that we can get decent results on embedding tasks without any further training.

So far, we have covered how MoEE works. In the next section, we will implement MoEE with BERTopic and cluster sentences.

3. Practical implementation: MoEE with BERTopic

In this section, we extract embeddings from a pretrained MoE LLM and use them with BERTopic on the 20 Newsgroups dataset. For reference, BERTopic is a convenient topic-modeling library that goes beyond traditional statistical topic modeling: it leverages Transformer embeddings for topic clustering, which makes it a good fit for checking embedding quality. First, let's set up an environment.

Environment setup

I used a conda environment with Python 3.10. I ran the experiments on Ubuntu 20.04 with CUDA 12.4 and 16 GB of VRAM. You may need about 32 GB of RAM to download the model weights.

conda create -n moee python=3.10 -y
conda activate moee

Next, install the following libraries via pip.

pip install transformers torch bitsandbytes bertopic accelerate

MoE models generally require a lot of VRAM because the entire model must be loaded into VRAM up front. We therefore use the quantization package bitsandbytes to save VRAM.
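The MoE-Embedding repository handles quantization internally; for reference, 4-bit loading with transformers and bitsandbytes generally looks like the sketch below (not the repository's exact code):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Generic 4-bit loading sketch (not the repository's exact configuration)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMoE-1B-7B-0924",
    quantization_config=bnb_config,
    device_map="auto",
)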

We also need to clone the official GitHub repository.

git clone https://github.com/tianyi-lab/MoE-Embedding.git

All the preparation is done. Now, let's implement topic clustering with BERTopic using MoEE.

MoEE with BERTopic

Now, we will use MoEE as the embedding model for BERTopic and try topic clustering. The original repository supports relatively small MoE models, such as Qwen1.5-MoE-A2.7B and OLMoE-1B-7B. In this article, I will use OLMoE-1B-7B, which can run inference within 16 GB of VRAM. First, we load OLMoE-1B-7B.

import torch
from moee import MOEE  # provided by the MoE-Embedding repository

# Arguments for loading OLMoE-1B-7B as an embedding model
kwargs = {
    "base_model": 'allenai/OLMoE-1B-7B-0924',
    "normalized": False,
    "torch_dtype": torch.bfloat16,
    "mode": "embedding",
    "pooling_method": "mean",
    "attn_implementation": "sdpa",
    "attn": "bbcc",
}

# Encoding configuration: PromptEOL prompting + MoEE embeddings
config = {
    'embed_method': 'prompteol',
    'emb_info': 'MoEE',
}

embedding_model = MOEE(model_name_or_path='allenai/OLMoE-1B-7B-0924', **kwargs)

Next, we compute embeddings for the 20 Newsgroups dataset to pass to BERTopic.

import numpy as np
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']

# MyDataset is a small torch Dataset wrapper; its definition is in the full listing below
dataset = MyDataset(docs)
dataloader = DataLoader(dataset=dataset, batch_size=8)
embeddings = None

for batch in tqdm(dataloader):
    with torch.no_grad():
        embedding = embedding_model.encode(batch, **config)

        # encode() returns a tuple whose first element holds the batch embeddings
        if embeddings is None:
            embeddings = embedding[0]
        else:
            embeddings = np.vstack((embeddings, embedding[0]))

    torch.cuda.empty_cache()

To precompute the embeddings, we use torch.utils.data.DataLoader as an iterator and encode each batch of documents. Note that the embeddings must be passed to BERTopic as a NumPy array.

If you want to use your own MoE model, you have to implement the extraction of routing weights from each MoE layer yourself. For the hidden-state embedding, we can rely on the Hugging Face transformers API: we only need to pass output_hidden_states=True at inference time.
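For MoE architectures whose transformers implementation exposes router logits (e.g., Mixtral or OLMoE), both signals can typically be requested in one forward pass; a hedged sketch:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch: request hidden states and router logits together.
# Check that your model's transformers implementation supports
# `output_router_logits` before relying on it.
name = "allenai/OLMoE-1B-7B-0924"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

inputs = tokenizer('This sentence: "hello" means in one word:', return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True, output_router_logits=True)

hidden_states = out.hidden_states   # tuple of per-layer hidden states
router_logits = out.router_logits   # tuple of per-MoE-layer expert logits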

Now we can run topic modeling.

# Step 2 - Reduce dimensionality
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric='cosine')

# Step 3 - Cluster reduced embeddings
hdbscan_model = HDBSCAN(min_cluster_size=15, metric='euclidean', cluster_selection_method='eom', prediction_data=True)

# Step 4 - Tokenize topics
vectorizer_model = CountVectorizer(stop_words="english")

# Step 5 - Create topic representation
ctfidf_model = ClassTfidfTransformer()

# Step 6 - (Optional) Fine-tune topic representations with 
# a `bertopic.representation` model
representation_model = KeyBERTInspired()

# All steps together
topic_model = BERTopic(
  embedding_model=embedding_model,          # Step 1 - Extract embeddings
  umap_model=umap_model,                    # Step 2 - Reduce dimensionality
  hdbscan_model=hdbscan_model,              # Step 3 - Cluster reduced embeddings
  vectorizer_model=vectorizer_model,        # Step 4 - Tokenize topics
  ctfidf_model=ctfidf_model,                # Step 5 - Extract topic words
  representation_model=representation_model # Step 6 - (Optional) Fine-tune topic representations
)

# topic modeling using BERTopic model
topics, probs = topic_model.fit_transform(docs, embeddings)

With the default settings we obtained 42 topics; the topic list and a few example topics are shown in the outputs of the full code listing below. Although I picked topics at random, the model captures their semantics well.

In addition, in the topic-cluster visualization produced by topic_model.visualize_topics(), look at the red circle: it marks topic 0, which is computer-related. The topics closest to it also involve machine-related vocabulary, such as graphics, digital, and printers.

This method shows that we can obtain decent embeddings without any training. Although there is still room to close the quality gap with SOTA supervised models, the paper's findings are a good step toward further improving training-free embedding extraction.

The full code is shown below for reference. You need to place this file at the top level of the MoE-Embedding directory.

import sys
sys.path.append('.')
import re
import numpy as np

import torch
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm

from umap import UMAP
from hdbscan import HDBSCAN
from sklearn.feature_extraction.text import CountVectorizer

from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired
from bertopic.vectorizers import ClassTfidfTransformer
from moee import MOEE
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Load dataset
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
def remove_punctuation(x: str) -> str:
    cleaned = re.sub(r"[!\"#$%&()*+-./:;<=>?@[\]^_`{|}~\n -' ]", " ", x)
    return cleaned

def clean_caption(x: str) -> str:
    # lowercase the text
    x = x.lower()
    
    # remove URLs and punctuation
    x = re.sub(r"http\S+", "", x)
    x = re.sub(r"www.\S+", "", x)
    x = remove_punctuation(x)
    x = re.sub(r"  ", " ", x)
    
    return x
docs = [clean_caption(doc) for doc in docs]
# Define MoEE and BERTopic
kwargs = {
    "base_model": 'allenai/OLMoE-1B-7B-0924',
    "normalized": False,
    "torch_dtype": torch.bfloat16,
    "mode": "embedding",
    "pooling_method": "mean",
    "attn_implementation": "sdpa",
    "attn": "bbcc",
}

config = {
    'embed_method': 'prompteol',
    'emb_info': 'MoEE',
}

embedding_model = MOEE(model_name_or_path='allenai/OLMoE-1B-7B-0924', **kwargs)
class MyDataset(Dataset):
    """Simple Dataset wrapper so documents can be batched with a DataLoader."""

    def __init__(self, docs, truncate_token_num: int = 300):
        self.docs = docs
        # note: this truncates by character count, not by token count
        self.truncate_token_num = truncate_token_num

    def __len__(self):
        return len(self.docs)

    def __getitem__(self, idx):
        if len(self.docs[idx]) > self.truncate_token_num:
            return self.docs[idx][:self.truncate_token_num]
        
        return self.docs[idx]
dataset = MyDataset(docs)
dataloader = DataLoader(dataset=dataset, batch_size=16)
embeddings = None

for batch in tqdm(dataloader):
    with torch.no_grad():      
        embedding = embedding_model.encode(batch, **config)
        
        if embeddings is None:
            embeddings = embedding[0]
        else:
            embeddings = np.vstack((embeddings, embedding[0]))
    
    torch.cuda.empty_cache()
# Encoding all 2,356 batches took roughly 44 minutes with this setup.
np.save('embedding.npy', embeddings)
# Step 2 - Reduce dimensionality
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric='cosine')

# Step 3 - Cluster reduced embeddings
hdbscan_model = HDBSCAN(min_cluster_size=15, metric='euclidean', cluster_selection_method='eom', prediction_data=True)

# Step 4 - Tokenize topics
vectorizer_model = CountVectorizer(stop_words="english")

# Step 5 - Create topic representation
ctfidf_model = ClassTfidfTransformer()

# Step 6 - (Optional) Fine-tune topic representations with 
# a `bertopic.representation` model
representation_model = KeyBERTInspired()

# All steps together
topic_model = BERTopic(
  embedding_model=embedding_model,          # Step 1 - Extract embeddings
  umap_model=umap_model,                    # Step 2 - Reduce dimensionality
  hdbscan_model=hdbscan_model,              # Step 3 - Cluster reduced embeddings
  vectorizer_model=vectorizer_model,        # Step 4 - Tokenize topics
  ctfidf_model=ctfidf_model,                # Step 5 - Extract topic words
  representation_model=representation_model # Step 6 - (Optional) Fine-tune topic representations
)
topics, probs = topic_model.fit_transform(docs, embeddings)
topic_model.get_topic_info()
Index	Topic	Count	Name	Representation	Representative_Docs
0	-1	5271	-1_christian_church_believe_read	[christian, church, believe, read, god, eviden...	[i have come across what i consider to be an e...
1	0	4110	0_dos_os_windows_microsoft	[dos, os, windows, microsoft, ms, pc, mac, dis...	[ \t munch \t munch following is reformatted...
2	1	1057	1_scripture_christianity_christians_bible	[scripture, christianity, christians, bible, c...	[ this is something i ve always found a littl...
3	2	1022	2_flyers_puck_nhl_leafs	[flyers, puck, nhl, leafs, sabres, bruins, pla...	[the flyers closed out the season last night w...
4	3	963	3_riding_driving_wheel_bike	[riding, driving, wheel, bike, ride, honda, bi...	[sixteen days i had put off test driving the h...
5	4	902	4_comics_hulk_sale_list	[comics, hulk, sale, list, wolverine, forsale,...	[the following comics are for auction the hig...
6	5	696	5_firearms_guns_handgun_gun	[firearms, guns, handgun, gun, crime, criminal...	[ because the gun loonies were firing on vehic...
7	6	626	6_infections_clinical_diseases_infection	[infections, clinical, diseases, infection, ca...	[ one of the responsibilities of a licensed ph...
8	7	567	7_maybe_mailing_probably_does	[maybe, mailing, probably, does, say, hope, gu...	[ oh yes i m quite sure they will , \ti looke...
9	8	480	8_nasa_spacecraft_shuttle_satellite	[nasa, spacecraft, shuttle, satellite, orbit, ...	[ in fact you probably want to avoid us govern...
10	9	478	9_clipper_encryption_decrypt_cryptography	[clipper, encryption, decrypt, cryptography, c...	[it looks like dorothy denning s wrong headed ...
11	10	380	10____	[, , , , , , , , , ]	[, , ]
12	11	290	11_palestinians_israeli_israelis_gaza	[palestinians, israeli, israelis, gaza, gazans...	[many of you ask me whether i approve of sever...
13	12	249	12_ax_9f_qax_b8f	[ax, 9f, qax, b8f, kn, 6um, pl, m9, max, k8]	[ part 13 of 14 mtm 3v9f0 7ey 7e...
14	13	206	13_armenians_armenian_armenia_azerbaijanis	[armenians, armenian, armenia, azerbaijanis, a...	[accounts of anti armenian human rights violat...
15	14	167	14_archive_graphics_formats_information	[archive, graphics, formats, information, data...	[archive name graphics resources list part1 la...
16	15	145	15_grounded_grounding_ground_outlets	[grounded, grounding, ground, outlets, wiring,...	[ no no nooo the ground green wire is for ...
17	16	133	16_scorer_pittsburgh_pts_pp	[scorer, pittsburgh, pts, pp, stl, 78, 43, det...	[scoring stats for the swedish nhl players apr...
18	17	103	17____	[, , , , , , , , , ]	[ , and a vga monitor e mail , cica indiana ...
19	18	97	18_supplementation_vitamin_vitamins_cancer	[supplementation, vitamin, vitamins, cancer, c...	[ i ll tell you all that i know about chromium...
20	19	87	19_batteries_radio_battery_electronics	[batteries, radio, battery, electronics, elect...	[ in order to emit blue light a semiconductor ...
21	20	86	20_nasa_spacecraft_saturn_astronomy	[nasa, spacecraft, saturn, astronomy, satellit...	[archive name space references last modified ...
22	21	75	21_investigation_bombing_evidence_news	[investigation, bombing, evidence, news, witne...	[i told some friends of mine two weeks ago tha...
23	22	54	22_stephanopoulos_briefing_secretary_president	[stephanopoulos, briefing, secretary, presiden...	[the white house office of the press...
24	23	51	23_send_entries_dos_fpu	[send, entries, dos, fpu, slip, pktmux, guidel...	[here are the standings after game 1 of each o...
25	24	50	24____	[, , , , , , , , , ]	[there seems to be a p pds slot in the above p...
26	25	44	25_islamic_islam_quran_qur	[islamic, islam, quran, qur, muslim, muslims, ...	[ secular laws seem to value criminal life mor...
27	26	42	26_nonsense_claims_censorship_argument	[nonsense, claims, censorship, argument, claim...	[ i m going to cut rex s ramblings down a bit ...
28	27	37	27_paintshop_contacting_sold_sent	[paintshop, contacting, sold, sent, thanks, f5...	[found it thanks i got several offers for help...
29	28	35	28_homosexuality_homosexual_homosexuals_hetero...	[homosexuality, homosexual, homosexuals, heter...	[ can someone tell me why when mr cramer spo...
30	29	35	29_sphere_triangulation_algorithms_perpendicular	[sphere, triangulation, algorithms, perpendicu...	[ good i had a bad feeling about this prob...
31	30	32	30_shortstop_pitchers_outfielder_hitters	[shortstop, pitchers, outfielder, hitters, bas...	[ he s not gone yet the position opening is d...
32	31	32	31_skepticism_geb_n3jxp_gordon	[skepticism, geb, n3jxp, gordon, intellect, in...	[ senile keratoses have nothing to do with th...
33	32	31	32_militia_amendment_constitution_firearm	[militia, amendment, constitution, firearm, li...	[ actually the words a well regulated milita ...
34	33	30	33_subscribe_unsubscribe_subscrive_email	[subscribe, unsubscribe, subscrive, email, wan...	[please subscribe me , please subscribe me , p...
35	34	28	34_speeding_manslaughter_policeman_cop	[speeding, manslaughter, policeman, cop, court...	[pmoloney maths tcd ie paul moloney writes n...
36	35	28	35_modems_modem_mhz_tcp	[modems, modem, mhz, tcp, digital, signal, mai...	[ db 25\tdb 9 pin \tpin \tname\teia\tccitt\tdt...
37	36	24	36_dial_0055_800_930314	[dial, 0055, 800, 930314, number, 9000, 8287, ...	[1 800 832 4778 western digital s voice mail ...
38	37	24	37_inkjet_inkjets_printers_laserjet	[inkjet, inkjets, printers, laserjet, deskjet,...	[fyi the actual horizontal dot placement reso...
39	38	24	38_rangers_adams_quakers_ivy	[rangers, adams, quakers, ivy, douglass, hope,...	[ i think that they go to divisional records b...
40	39	21	39_homosexual_percent_sexual_majority	[homosexual, percent, sexual, majority, percen...	[ from the santa rosa cal press democrat apr...
41	40	19	40_irony_cycnicism_sarcasm_acetone	[irony, cycnicism, sarcasm, acetone, humour, k...	[ \t1 they are religious parodies not atheisti...
42	41	15	41_autobiography_author_book_books	[autobiography, author, book, books, bookstore...	[this is the story of kent the archetype finn ...
topic_model.get_topic(0)
[('dos', np.float32(0.45857304)),
 ('os', np.float32(0.43415424)),
 ('windows', np.float32(0.40028214)),
 ('microsoft', np.float32(0.32284227)),
 ('ms', np.float32(0.31080914)),
 ('pc', np.float32(0.28627717)),
 ('mac', np.float32(0.2705468)),
 ('disk', np.float32(0.26714522)),
 ('scsi', np.float32(0.24755469)),
 ('cx', np.float32(0.2305391))]
topic_model.get_topic(2)
[('flyers', np.float32(0.5347663)),
 ('puck', np.float32(0.4863899)),
 ('nhl', np.float32(0.4710263)),
 ('leafs', np.float32(0.4642067)),
 ('sabres', np.float32(0.45007592)),
 ('bruins', np.float32(0.41095752)),
 ('playoffs', np.float32(0.39904732)),
 ('hockey', np.float32(0.3952221)),
 ('pitching', np.float32(0.39289254)),
 ('braves', np.float32(0.37793285))]
topic_model.get_topic(29)
[('sphere', np.float32(0.42566895)),
 ('triangulation', np.float32(0.42115515)),
 ('algorithms', np.float32(0.37481007)),
 ('perpendicular', np.float32(0.36362517)),
 ('algorithm', np.float32(0.35225672)),
 ('3d', np.float32(0.351159)),
 ('coplanar', np.float32(0.31972635)),
 ('circle', np.float32(0.29665813)),
 ('vertices', np.float32(0.28228626)),
 ('bisector', np.float32(0.2748276))]
topic_model.visualize_topics()  # interactive rendering in a notebook requires nbformat>=4.2.0

References:

[1] Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free: https://arxiv.org/pdf/2410.10814

[2] Mixture of Experts Explained (Hugging Face blog): https://huggingface.co/blog/moe

[3] Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity: https://arxiv.org/pdf/2101.03961

[4] Scaling Sentence Embeddings with Large Language Models (PromptEOL): https://arxiv.org/pdf/2307.16645
