NLP Study Notes: A Walkthrough of Implementing the CBOW Model Class

2024-04-02 19:04:59, by 安东尼

Implementing the CBOW model class

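Initialization: the initializer takes the vocabulary size vocab_size and the number of hidden-layer neurons hidden_size. It first creates two weight matrices, W_in and W_out, and initializes them with small random values. Calling astype('f') makes them 32-bit floating-point arrays.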
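As a small illustration of the initialization step described above, the sketch below builds the two weight matrices and inspects their shapes and dtype. The sizes 7 and 5 are only illustrative values (they match the example used later in this article).

```python
import numpy as np

vocab_size, hidden_size = 7, 5  # illustrative values only

# Small random values, cast to 32-bit floats with astype('f')
W_in = 0.01 * np.random.randn(vocab_size, hidden_size).astype('f')
W_out = 0.01 * np.random.randn(hidden_size, vocab_size).astype('f')

print(W_in.shape, W_in.dtype)    # (7, 5) float32
print(W_out.shape, W_out.dtype)  # (5, 7) float32
```

W_in has one row per word and W_out has one column per word, which is why either can later be read as a table of word vectors.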

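Creating the layers: create two MatMul layers on the input side (one per context word, both sharing W_in), one MatMul layer on the output side, and one Softmax with Loss layer.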

Storing weights and gradients: the weight parameters and gradients used in the network are collected in the list-type member variables params and grads.

Forward propagation, forward(): this function takes the arguments contexts and target and returns the loss. The two arguments are structured as follows.

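contexts is a three-dimensional NumPy array: dimension 0 indexes the samples in the mini-batch, dimension 1 indexes the context words (2 per target when window_size=1), and dimension 2 is the one-hot vector. target is the matching two-dimensional array of one-hot vectors (one row per sample). So what do the following two lines extract?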

 h0 = self.in_layer0.forward(contexts[:, 0])
 h1 = self.in_layer1.forward(contexts[:, 1])

jym ran a quick test:

import sys
sys.path.append('..')
from common.util import preprocess #, create_co_matrix, most_similar
from common.util import create_contexts_target, convert_one_hot
text = 'You say Goodbye and I say hello.'
corpus, word_to_id, id_to_word = preprocess(text)
contexts, target = create_contexts_target(corpus, window_size=1)
#print(contexts)
#print(target)
vocab_size = len(word_to_id)
target = convert_one_hot(target, vocab_size)
contexts = convert_one_hot(contexts, vocab_size)
print(contexts[:, 0])

Output: as the result shows, contexts[:, 0] selects the word to the left of each target.

[[1 0 0 0 0 0 0]
[0 1 0 0 0 0 0]
[0 0 1 0 0 0 0]
[0 0 0 1 0 0 0]
[0 0 0 0 1 0 0]
[0 1 0 0 0 0 0]]
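The same slicing can be reproduced with NumPy alone, without the common package. The sketch below assumes the word-to-ID mapping you=0, say=1, goodbye=2, and=3, i=4, hello=5, .=6, and builds the one-hot contexts by hand rather than with create_contexts_target and convert_one_hot.

```python
import numpy as np

# Word IDs for "you say goodbye and i say hello ." (assumed mapping, see above)
corpus = np.array([0, 1, 2, 3, 4, 1, 5, 6])
vocab_size = 7

# With window_size=1, every target has one left and one right neighbour
contexts_ids = np.array([[corpus[i - 1], corpus[i + 1]]
                         for i in range(1, len(corpus) - 1)])

# One-hot encode: shape becomes (num_samples, num_context_words, vocab_size)
contexts = np.eye(vocab_size, dtype=np.int32)[contexts_ids]

left = contexts[:, 0]   # one-hot rows for the word left of each target
right = contexts[:, 1]  # one-hot rows for the word right of each target

print(contexts.shape)        # (6, 2, 7)
print(left.argmax(axis=1))   # [0 1 2 3 4 1]
print(right.argmax(axis=1))  # [2 3 4 1 5 6]
```

Indexing with [:, 0] keeps every sample and every one-hot component but picks only the first context slot, which is why the result is a 6 x 7 matrix of left-hand words.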

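Backpropagation, backward(): backpropagation sends gradients through the network in the direction opposite to forward propagation. It starts from 1 and passes that value to the Softmax with Loss layer. The output ds of that layer's backward pass is then passed to the output-side MatMul layer. Backpropagation through a "×" node multiplies the gradient by the forward-pass inputs with their roles swapped; backpropagation through a "+" node passes the gradient along unchanged.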

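backward() simply calls the backward methods of the layers written earlier, such as loss_layer.backward(dout). Once it returns, the gradient of each weight parameter is stored in the member variable grads (the layers' own backward methods take care of this). In short: call forward() first, then backward(), and the gradients in the grads list are updated.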
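The two rules for "×" and "+" nodes can be checked on a small matrix product. This is only a sketch with made-up shapes and an all-ones upstream gradient; it is not the code of the MatMul layer itself.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 7))  # mini-batch of 3 inputs
W = rng.standard_normal((7, 5))  # weight matrix
dout = np.ones((3, 5))           # upstream gradient (illustrative)

# Forward pass of the multiplication node: y = x @ W
y = x @ W

# Backward pass of the multiplication node: the forward inputs swap roles
dx = dout @ W.T  # gradient w.r.t. x is built from W
dW = x.T @ dout  # gradient w.r.t. W is built from x

# Backward pass of an addition node h = h0 + h1: the gradient is passed on unchanged
dh = np.ones((3, 5))
dh0, dh1 = dh, dh

print(dx.shape, dW.shape)  # (3, 7) (7, 5)
```

In SimpleCBOW the hidden vector is h = (h0 + h1) * 0.5, so the gradient is additionally multiplied by 0.5 before it reaches the two input layers.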

import sys
sys.path.append('..')
import numpy as np
from common.layers import MatMul, SoftmaxWithLoss
class SimpleCBOW:
    def __init__(self, vocab_size, hidden_size):
        V, H = vocab_size, hidden_size
        # Initialize the weights
        W_in = 0.01 * np.random.randn(V, H).astype('f')
        W_out = 0.01 * np.random.randn(H, V).astype('f')
        # Create the layers
        self.in_layer0 = MatMul(W_in)
        self.in_layer1 = MatMul(W_in)
        self.out_layer = MatMul(W_out)
        self.loss_layer = SoftmaxWithLoss()
        # Collect all weights and gradients into lists
        layers = [self.in_layer0, self.in_layer1, self.out_layer]
        self.params, self.grads = [], []
        for layer in layers:
            self.params += layer.params
            self.grads += layer.grads
        # Expose the distributed word representations as a member variable
        self.word_vecs = W_in
    def forward(self, contexts, target):
        h0 = self.in_layer0.forward(contexts[:, 0])
        h1 = self.in_layer1.forward(contexts[:, 1])
        h = (h0 + h1) * 0.5
        score = self.out_layer.forward(h)
        loss = self.loss_layer.forward(score, target)
        return loss
    def backward(self, dout=1):
        ds = self.loss_layer.backward(dout)
        da = self.out_layer.backward(ds)
        da *= 0.5
        self.in_layer1.backward(da)
        self.in_layer0.backward(da)
        return None
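SimpleCBOW depends on MatMul and SoftmaxWithLoss from common.layers, which this article does not list. The following stand-ins are a minimal sketch of what such layers could look like, assuming the interface that SimpleCBOW relies on (params, grads, forward, backward) and one-hot targets. They are not the actual contents of common.layers.

```python
import numpy as np


class MatMul:
    """Stand-in for common.layers.MatMul (assumed interface)."""

    def __init__(self, W):
        self.params = [W]
        self.grads = [np.zeros_like(W)]
        self.x = None

    def forward(self, x):
        W, = self.params
        self.x = x
        return x @ W

    def backward(self, dout):
        W, = self.params
        dx = dout @ W.T
        # Write in place so that references held in model.grads stay valid
        self.grads[0][...] = self.x.T @ dout
        return dx


class SoftmaxWithLoss:
    """Stand-in for common.layers.SoftmaxWithLoss; expects one-hot targets."""

    def __init__(self):
        self.params, self.grads = [], []
        self.y = None
        self.t = None

    def forward(self, score, t):
        score = score - score.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(score)
        self.y = exp / exp.sum(axis=1, keepdims=True)
        self.t = t
        return -np.sum(t * np.log(self.y + 1e-7)) / score.shape[0]

    def backward(self, dout=1):
        return dout * (self.y - self.t) / self.t.shape[0]


# Quick check on three one-hot inputs and five output classes
layer = MatMul(0.01 * np.random.randn(7, 5).astype('f'))
loss_layer = SoftmaxWithLoss()
x = np.eye(7, dtype='f')[[0, 1, 2]]
t = np.eye(5, dtype='f')[[1, 2, 3]]
loss = loss_layer.forward(layer.forward(x), t)
layer.backward(loss_layer.backward())
print(loss)  # close to log(5), since the initial scores are almost uniform
```

Because the gradient is written into the existing array instead of replacing it, the grads list assembled in SimpleCBOW.__init__ keeps pointing at up-to-date values after every backward() call.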

Implementing the Trainer class

Training the CBOW model: prepare the training data for the network, then compute gradients and update the weight parameters step by step.

The Trainer class: the class that carries out training.

Initialization: the initializer takes the neural network (the model) and an optimizer (SGD, Momentum, AdaGrad, or Adam).

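Training: call fit() to start training. Parameters: x, the input data; t, the supervised labels; max_epoch, the number of epochs to train; batch_size, the mini-batch size; eval_interval, how often to report results such as the average loss (for example, with eval_interval=20 the average loss is computed and printed once every 20 iterations); max_grad, the maximum gradient norm (when the norm of the gradients exceeds this value, the gradients are scaled down).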

 def fit(self, x, t, max_epoch=10, batch_size=32, max_grad=None, eval_interval=20):
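The max_grad behaviour described above is commonly implemented as clipping by the combined L2 norm of all gradients. The function below is a sketch of that idea under the hypothetical name clip_grads_sketch; it is not claimed to be identical to the clip_grads helper that Trainer calls.

```python
import numpy as np


def clip_grads_sketch(grads, max_norm):
    """Scale all gradients in place so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    rate = max_norm / (total_norm + 1e-6)
    if rate < 1:
        for g in grads:
            g *= rate  # in-place, so the caller's arrays are modified
    return total_norm


grads = [np.array([3.0, 4.0]), np.array([12.0])]  # combined norm is 13
old_norm = clip_grads_sketch(grads, max_norm=1.0)
new_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
print(old_norm, new_norm)
```

Scaling every gradient by the same factor keeps the update direction and only shortens the step, which is the point of capping the norm.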

The plot method: plots the loss recorded by fit() (the average loss evaluated every eval_interval iterations).

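import sys
sys.path.append('..')
import time
import numpy
import matplotlib.pyplot as plt
from common.util import clip_grads
# remove_duplicate is assumed to be defined in the same module as Trainer (common/trainer.py)
class Trainer: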
    def __init__(self, model, optimizer):
        self.model = model
        self.optimizer = optimizer
        self.loss_list = []
        self.eval_interval = None
        self.current_epoch = 0
    def fit(self, x, t, max_epoch=10, batch_size=32, max_grad=None, eval_interval=20):
        data_size = len(x)
        max_iters = data_size // batch_size
        self.eval_interval = eval_interval
        model, optimizer = self.model, self.optimizer
        total_loss = 0
        loss_count = 0
        start_time = time.time()
        for epoch in range(max_epoch):
            # Shuffle the data
            idx = numpy.random.permutation(numpy.arange(data_size))
            x = x[idx]
            t = t[idx]
            for iters in range(max_iters):
                batch_x = x[iters*batch_size:(iters+1)*batch_size]
                batch_t = t[iters*batch_size:(iters+1)*batch_size]

                # Compute gradients and update the parameters
                loss = model.forward(batch_x, batch_t)
                model.backward()
                params, grads = remove_duplicate(model.params, model.grads)  # merge shared weights into one entry
                if max_grad is not None:
                    clip_grads(grads, max_grad)
                optimizer.update(params, grads)
                total_loss += loss
                loss_count += 1
                # Evaluate
                if (eval_interval is not None) and (iters % eval_interval) == 0:
                    avg_loss = total_loss / loss_count
                    elapsed_time = time.time() - start_time
                    print('| epoch %d |  iter %d / %d | time %d[s] | loss %.2f'
                          % (self.current_epoch + 1, iters + 1, max_iters, elapsed_time, avg_loss))
                    self.loss_list.append(float(avg_loss))
                    total_loss, loss_count = 0, 0
            self.current_epoch += 1
    def plot(self, ylim=None):
        x = numpy.arange(len(self.loss_list))
        if ylim is not None:
            plt.ylim(*ylim)
        plt.plot(x, self.loss_list, label='train')
        plt.xlabel('iterations (x' + str(self.eval_interval) + ')')
        plt.ylabel('loss')
        plt.show()
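The call to remove_duplicate matters for SimpleCBOW because in_layer0 and in_layer1 share the same W_in array, so W_in appears twice in model.params. The sketch below, under the hypothetical name merge_shared, shows the basic idea of keeping each shared array once and summing its gradients. The real helper may handle more cases than this simplified version.

```python
import numpy as np


def merge_shared(params, grads):
    """Return new lists where parameters that are the same array object appear once,
    with their gradients summed. The input lists are left untouched."""
    merged_params, merged_grads = [], []
    for p, g in zip(params, grads):
        for i, q in enumerate(merged_params):
            if p is q:
                # Same underlying weight: accumulate its gradient
                merged_grads[i] = merged_grads[i] + g
                break
        else:
            merged_params.append(p)
            merged_grads.append(g.copy())
    return merged_params, merged_grads


W_in = np.zeros((7, 5))
W_out = np.zeros((5, 7))
params = [W_in, W_in, W_out]  # the order used in SimpleCBOW.__init__
grads = [np.ones((7, 5)), 2 * np.ones((7, 5)), np.ones((5, 7))]

p, g = merge_shared(params, grads)
print(len(p), g[0][0, 0])  # 2 3.0
```

Without this step the optimizer would update W_in twice per iteration, once with each half of its gradient, which adaptive optimizers such as Adam would not treat the same as a single update with the summed gradient.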

Here the Trainer class is used to train the CBOW model.

model is a SimpleCBOW instance, so it carries that class's member variables (params, grads, and word_vecs).

model = SimpleCBOW(vocab_size, hidden_size)

Trainer is then called as follows:

trainer = Trainer(model, optimizer)
trainer.fit(contexts, target, max_epoch, batch_size)
trainer.plot()

# coding: utf-8
import sys
sys.path.append('..')  # make files in the parent directory importable
from common.trainer import Trainer
from common.optimizer import Adam
from simple_cbow import SimpleCBOW
from common.util import preprocess, create_contexts_target, convert_one_hot
window_size = 1
hidden_size = 5
batch_size = 3
max_epoch = 1000
text = 'You say goodbye and I say hello.'
corpus, word_to_id, id_to_word = preprocess(text)
vocab_size = len(word_to_id)
contexts, target = create_contexts_target(corpus, window_size)
target = convert_one_hot(target, vocab_size)
contexts = convert_one_hot(contexts, vocab_size)
model = SimpleCBOW(vocab_size, hidden_size)
optimizer = Adam()
trainer = Trainer(model, optimizer)
trainer.fit(contexts, target, max_epoch, batch_size)
trainer.plot()
word_vecs = model.word_vecs
for word_id, word in id_to_word.items():
    print(word, word_vecs[word_id])

Result: [figure: training loss curve drawn by trainer.plot()]

The SimpleCBOW class has the following member variable: the weight matrix W_in is the distributed representation of the words.

# Expose the distributed word representations as a member variable
self.word_vecs = W_in

So we can take a look at the distributed representations of the words.

word_vecs = model.word_vecs
for word_id, word in id_to_word.items():
    print(word, word_vecs[word_id])

The result is shown below. Each word is now represented as a dense vector.

you [-0.9987413 1.0136298 -1.4921554 0.97300434 1.0181936 ]
say [ 1.161595 -1.1513934 -0.25779223 -1.1773298 -1.1531342 ]
goodbye [-0.88470864 0.9155085 -0.30859873 0.9318609 0.9092796 ]
and [ 0.7929211 -0.8148116 -1.8787507 -0.7845257 -0.8028278]
i [-0.8925459 0.95505357 -0.29667985 0.90895575 0.90703803]
hello [-1.0259517 0.97562104 -1.5057516 0.96239203 1.0297285 ]
. [ 1.2134467 -1.1766206 1.6439314 -1.1993438 -1.1676227]

Why five numbers per word? It comes down to the weight matrix W_in, whose shape in the SimpleCBOW class is determined by the vocabulary size and hidden_size.

V, H = vocab_size, hidden_size
# Initialize the weights
W_in = 0.01 * np.random.randn(V, H).astype('f')

When training the CBOW model with the Trainer class we set hidden_size = 5, so each word ends up as a vector of five numbers.
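One way to make use of such dense vectors is to compare them with cosine similarity. The sketch below copies three of the vectors printed above (rounded to four decimals) and uses a hypothetical helper cos_similarity. With a corpus this small the numbers mainly show the mechanics and should not be read as meaningful word relationships.

```python
import numpy as np

# Vectors copied from the printed result above, rounded to four decimals
vecs = {
    'you':   np.array([-0.9987, 1.0136, -1.4922, 0.9730, 1.0182]),
    'hello': np.array([-1.0260, 0.9756, -1.5058, 0.9624, 1.0297]),
    'say':   np.array([1.1616, -1.1514, -0.2578, -1.1773, -1.1531]),
}


def cos_similarity(x, y, eps=1e-8):
    """Cosine of the angle between x and y; eps guards against division by zero."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y) + eps))


sim_you_hello = cos_similarity(vecs['you'], vecs['hello'])
sim_you_say = cos_similarity(vecs['you'], vecs['say'])
print(sim_you_hello, sim_you_say)
```

Here you and hello point in almost the same direction, while you and say point in roughly opposite directions.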

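Training the CBOW model means adjusting the weights so that predictions become accurate. For example, when the context is you and goodbye, the correct label is say; if the network has good weights, the neuron for the correct answer (say) should receive a higher score.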

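Training the network relies on the Softmax function and the cross-entropy error: Softmax converts the scores into probabilities, the cross-entropy error between those probabilities and the supervised labels is computed, and that error is used as the loss. Adding a Softmax layer and a Cross Entropy Error layer to the inference-only CBOW model yields the loss.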
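The score-to-loss step can be sketched in a few lines of NumPy. The scores below are made up for illustration (index 1 stands for say in the seven-word vocabulary), and the helper names softmax and cross_entropy_error are this sketch's own.

```python
import numpy as np


def softmax(score):
    """Row-wise softmax with the usual max subtraction for numerical stability."""
    score = score - score.max(axis=1, keepdims=True)
    e = np.exp(score)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy_error(y, t):
    """Mean cross-entropy between probabilities y and one-hot labels t."""
    return -np.sum(t * np.log(y + 1e-7)) / y.shape[0]


# Hypothetical scores for one sample; the correct word "say" has index 1
score = np.array([[0.2, 2.5, 0.1, -0.3, 0.0, 0.4, -1.0]])
t = np.array([[0, 1, 0, 0, 0, 0, 0]])

prob = softmax(score)
loss = cross_entropy_error(prob, t)
print(prob.round(3), loss)
```

The higher the probability assigned to the correct word, the closer the loss gets to zero, so minimizing the loss pushes the score of say up relative to the other words.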

Both the input-side and output-side weights can be regarded as distributed representations of the words; here only the input-side weights are used.

Finally, here is the CBOW model class from earlier once more:

class SimpleCBOW:
    def __init__(self, vocab_size, hidden_size):
        V, H = vocab_size, hidden_size
        # Initialize the weights
        W_in = 0.01 * np.random.randn(V, H).astype('f')
        W_out = 0.01 * np.random.randn(H, V).astype('f')
        # Create the layers
        self.in_layer0 = MatMul(W_in)
        self.in_layer1 = MatMul(W_in)
        self.out_layer = MatMul(W_out)
        self.loss_layer = SoftmaxWithLoss()
        # Collect all weights and gradients into lists
        layers = [self.in_layer0, self.in_layer1, self.out_layer]
        self.params, self.grads = [], []
        for layer in layers:
            self.params += layer.params
            self.grads += layer.grads
        # Expose the distributed word representations as a member variable
        self.word_vecs = W_in
    def forward(self, contexts, target):
        h0 = self.in_layer0.forward(contexts[:, 0])
        h1 = self.in_layer1.forward(contexts[:, 1])
        h = (h0 + h1) * 0.5
        score = self.out_layer.forward(h)
        loss = self.loss_layer.forward(score, target)
        return loss
    def backward(self, dout=1):
        ds = self.loss_layer.backward(dout)
        da = self.out_layer.backward(ds)
        da *= 0.5
        self.in_layer1.backward(da)
        self.in_layer0.backward(da)
        return None

Article link: https://lsjlt.com/news/145022.html (please cite the source link when reposting)