Machine Learning | nex3z's blog

Deep Learning Note: 5-13 Speech Recognition

Author: nex3z 2018-02-17

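1. Speech Recognition
In a speech recognition problem, the input is an audio clip of speech and the output is the corresponding text. Just as the human ear does not process sound waves directly but picks up speech by detecting the intensity of different frequencies in the sound, a common preprocessing step in speech recognition is to compute a spectrogram of the raw audio, as shown in Figure 1, and hand the spectrogram to the algorithm. In the spectrogram in the lower part of Figure 1, the horizontal axis is time, the vertical axis is frequency, and the color indicates the energy of the sound at that frequency.
Speech recognition systems were once built on phonemes, a hand-engineered…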
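
The spectrogram preprocessing step described above can be sketched with a short-time Fourier transform. This is a minimal illustration under assumed settings, not the pipeline from the post: the frame length, hop size, Hann window, and the 1 kHz test tone are all choices made for this example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Return a (num_frames, frame_len // 2 + 1) array of power values."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Toy input (assumed values): one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)                        # rows: time, columns: frequency
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
peak_hz = freqs[spec.mean(axis=0).argmax()]     # frequency with the most energy
```

Each row of `spec` is one time step and each column one frequency bin, which matches the time and frequency axes of the figure described in the excerpt.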

Machine Learning

Machine Learning, Speech Recognition, Trigger Word Detection

Deep Learning Note: 5-11 Bleu Score

Author: nex3z 2018-02-16

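One way machine translation differs from tasks covered earlier, such as image recognition, is that the correct answer is not unique. For example, given the following French sentence:
Le chat est sur le tapis.
a human could produce several different reference English translations, all of good quality, such as:
Reference 1: The cat is on the mat.
Reference 2: There is a cat on the mat.
Having more than one correct answer makes it hard to measure an algorithm's accuracy. In such cases, the usual approach is to use…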
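
The post title names the Bleu score as the metric for this situation. Its core ingredient, modified (clipped) n-gram precision, can be sketched as below. The helper name, the candidate sentences, and the naive whitespace tokenization are assumptions for this example; the full Bleu score also combines several n-gram orders and applies a brevity penalty, neither of which is shown here.

```python
from collections import Counter

def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision of a candidate against reference translations."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate.lower().split())
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.lower().split()).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Credit each candidate n-gram at most as often as a reference uses it.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

references = ["The cat is on the mat.", "There is a cat on the mat."]
p1 = modified_precision("the the the the the the the", references)       # 2 / 7
p2 = modified_precision("The cat the cat on the mat.", references, n=2)  # 4 / 6
```

Clipping is what keeps the degenerate all-"the" candidate from scoring 7/7: "the" appears at most twice in any one reference, so only two of its seven occurrences are credited.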

Machine Learning

Bleu Score, Machine Learning

Deep Learning Note: 5-10 Beam Search

Author: nex3z 2018-02-15

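1. Beam Search
As mentioned earlier, in machine translation we want the sentence with the highest probability, and beam search is an algorithm for finding such a sentence.
Continuing with the earlier French-to-English translation task, take the following French sentence as input:
Jane visite l’Afrique en septembre.
The first step of beam search is to use the network shown in Figure 1 to compute $P(y^{\lt 1 \gt}|x)$. In the greedy algorithm, we only…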
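
To make the contrast between greedy decoding and beam search concrete, here is a minimal sketch over a toy model. The probability table, the `toy_model` function, and the `<EOS>` marker are hypothetical stand-ins for the translation network, which in practice would supply $P(y^{\lt t \gt}|x, y^{\lt 1 \gt}, …)$ at each step.

```python
import math

# Hypothetical next-word probabilities, keyed by the words produced so far.
TABLE = {
    (): {"Jane": 0.4, "In": 0.6},
    ("Jane",): {"visits": 0.9, "is": 0.1},
    ("In",): {"September": 0.55, "Africa": 0.45},
    ("Jane", "visits"): {"<EOS>": 1.0},
    ("Jane", "is"): {"<EOS>": 1.0},
    ("In", "September"): {"<EOS>": 1.0},
    ("In", "Africa"): {"<EOS>": 1.0},
}

def toy_model(prefix):
    return TABLE[prefix]

def beam_search(next_probs, beam_width, max_len, eos="<EOS>"):
    """Return the best (tokens, log_prob) found with a beam of the given width."""
    beams = [((), 0.0)]   # (prefix, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, p in next_probs(prefix).items():
                candidates.append((prefix + (token,), score + math.log(p)))
        # Keep only the beam_width most probable partial sentences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_width]:
            (finished if prefix[-1] == eos else beams).append((prefix, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])

greedy, greedy_score = beam_search(toy_model, beam_width=1, max_len=3)
best, best_score = beam_search(toy_model, beam_width=2, max_len=3)
```

With a beam width of 1 the search is greedy: it commits to "In" (0.6) and ends with a sentence of probability 0.33. With a width of 2 it also keeps "Jane" and finds "Jane visits" at 0.36, which the greedy pass never considers.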

Machine Learning

Beam Search, Error Analysis, Machine Learning

Deep Learning Note: 5-9 Sequence to Sequence Models

Author: nex3z 2018-02-14

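1. Basic Model
Sequence to Sequence is an algorithm that maps one sequence to another, and it is commonly used for machine translation and speech recognition.
For example, suppose we want to translate a French sentence into English, as shown in Figure 1.
The Sequence to Sequence algorithm uses two RNNs to carry out the translation. The first RNN is called the encoder. Its input is text in the source language, here the French sentence, and its output is a vector that serves as an encoding of the input text…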

Machine Learning

Machine Learning, Sequence to Sequence

Deep Learning Note: 5-8 Applications of Word Embeddings

Author: nex3z 2018-02-13

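1. Sentiment Classification
Sentiment classification means predicting, from a piece of text, whether the author likes the thing being discussed. For this task we may not have much training data, for example only 10,000 to 100,000 words in total, but with word embeddings a good sentiment classifier can be built from a modest amount of data.
For example, when using customers' reviews of a restaurant to predict how much they like it, the input $x$ is the customer's…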

Machine Learning

Debias, Machine Learning, Sentiment Classification, Word Embedding

Deep Learning Note: 5-6 Introduction to Word Embeddings

Author: nex3z 2018-02-09

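1. Representing Words
Earlier, when introducing RNNs, we represented words by one-hot encoding them against a vocabulary. For example, with a 10000-word vocabulary $V = [a, aaron, …, zulu, \lt UNK \gt]$, each word is encoded as a vector of length 10000 in which the entry at the word's position in the vocabulary is 1 and every other entry is 0. Figure 1…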
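
The one-hot encoding described above can be sketched in a few lines. The four-word vocabulary stands in for the 10000-word one, and mapping out-of-vocabulary words to the `<UNK>` entry is the usual convention, assumed here rather than stated in the excerpt.

```python
def one_hot(word, vocab):
    """Encode a word as a list with a single 1 at the word's vocabulary index."""
    index = {w: i for i, w in enumerate(vocab)}
    # Words outside the vocabulary fall back to the <UNK> entry.
    position = index.get(word, index["<UNK>"])
    vector = [0] * len(vocab)
    vector[position] = 1
    return vector

# A tiny stand-in for the 10000-word vocabulary.
vocab = ["a", "aaron", "zulu", "<UNK>"]
encoded = one_hot("aaron", vocab)    # [0, 1, 0, 0]
unknown = one_hot("durian", vocab)   # [0, 0, 0, 1]
```

Every pair of distinct one-hot vectors has a dot product of 0, so this representation carries no notion of similarity between words.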

Machine Learning

Machine Learning, Word Embedding

Deep Learning Note: 5-5 Bidirectional RNNs and Deep RNNs

Author: nex3z 2018-02-06

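1. Bidirectional RNN
One problem with the RNNs introduced so far is that at time $t$ the network can only make predictions from the content seen up to time $t$ and cannot see anything that comes after it. For example, consider the following two sentences:
He said, “Teddy bears are on sale!”
He said, “Teddy Roosevelt was a great President!”…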

Machine Learning

Bidirectional RNN, Deep RNN, Machine Learning

Deep Learning Note: 5-3 Language Models

Author: nex3z 2018-02-04

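1. Language Modeling
1.1. Language Models
Consider using speech recognition on the following two sentences:
The apple and pair salad.
The apple and pear salad.
The two sentences sound exactly the same. A person hearing them would naturally assume the second was meant, whereas an algorithm needs a language model to determine what the current speech input actually corresponds to…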

Machine Learning

Language Model, Machine Learning, RNN

Deep Learning Note: 5-2 Recurrent Neural Networks

Author: nex3z 2018-02-03

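1. Recurrent Neural Networks
Figure 1 shows the structure of a simple recurrent neural network.
The network processes the words of the input $x$ one at a time. It first takes the first word $x^{\lt 1 \gt}$ and feeds it into a neural network layer, producing the activation $a^{\lt 1 \gt}$ and the prediction $\hat{y}^{\lt 1 \gt}$ for $x^{\lt 1 \gt}$. It then takes the second word $x^{\lt 2 \gt}$ and feeds it, together with the activation $a^{\lt 1 \gt}$ from the previous step, into a neural network layer, producing for $x^{\lt 2 \gt}$ the activation $a…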
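
The step-by-step processing described above can be sketched as a forward pass in NumPy. The excerpt does not give the layer's formula, so the tanh activation, the softmax output, the zero initial activation, and the weight names `Waa`, `Wax`, `Wya`, `ba`, `by` are assumptions based on a common textbook formulation, and the sizes are toy values.

```python
import numpy as np

def rnn_forward(xs, Waa, Wax, Wya, ba, by):
    """Run a simple RNN over a sequence; return activations and predictions."""
    a = np.zeros(Waa.shape[0])   # assumed initial activation a<0> = 0
    activations, predictions = [], []
    for x in xs:
        # The new activation depends on the previous activation and current input.
        a = np.tanh(Waa @ a + Wax @ x + ba)
        logits = Wya @ a + by
        exp = np.exp(logits - logits.max())   # numerically stable softmax
        predictions.append(exp / exp.sum())
        activations.append(a)
    return np.array(activations), np.array(predictions)

rng = np.random.default_rng(0)
n_x, n_a, n_y, T = 5, 4, 3, 6   # assumed toy sizes: input, hidden, output, steps
xs = rng.normal(size=(T, n_x))
params = dict(
    Waa=rng.normal(size=(n_a, n_a)), Wax=rng.normal(size=(n_a, n_x)),
    Wya=rng.normal(size=(n_y, n_a)), ba=np.zeros(n_a), by=np.zeros(n_y),
)
acts, preds = rnn_forward(xs, **params)
```

The same weights are reused at every step; only the activation `a` carries information from earlier words to later ones.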

Machine Learning

Machine Learning, RNN

Deep Learning Note: 5-1 Sequence Models

Author: nex3z 2018-02-03

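1. Sequence Models
Recurrent neural networks (RNNs) are used to model sequence data. Common applications include:
Speech recognition: the input is a segment of audio and the output is the text of what was said. Both the input and the output are sequences.
Music generation: there is no input, or the input is a specific parameter (such as a number representing the musical style), and the output is a piece of music. Only the output is a sequence.
Sentiment classification (Sentiment…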

Machine Learning

Machine Learning, RNN, Sequence Model

Tag Archive: Machine Learning
