x shape: (n, ns)
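Laplace smoothing
After counting how often each word and each run of consecutive words occurs, add a small constant to every count before converting the counts to probabilities, so that unseen combinations do not get probability zero.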
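As a rough illustration of the idea, here is a minimal sketch of add-alpha (Laplace) smoothing for unigram and bigram estimates. The notes do not give a formula, so the standard add-alpha form is assumed here; the function names, the toy sentence, and the extra vocabulary word "dog" are all hypothetical.

```python
from collections import Counter


def laplace_unigram_probs(tokens, vocab, alpha=1.0):
    """Estimate P(w) for every w in vocab with add-alpha smoothing."""
    counts = Counter(tokens)
    denom = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / denom for w in vocab}


def laplace_bigram_prob(tokens, vocab, prev, word, alpha=1.0):
    """Estimate P(word | prev) with add-alpha smoothing."""
    # Count only positions that have a following token.
    prev_counts = Counter(tokens[:-1])
    pair_counts = Counter(zip(tokens[:-1], tokens[1:]))
    return (pair_counts[(prev, word)] + alpha) / (prev_counts[prev] + alpha * len(vocab))


tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens)) + ["dog"]  # "dog" never occurs in the tokens

probs = laplace_unigram_probs(tokens, vocab)
p_seen = laplace_bigram_prob(tokens, vocab, "the", "cat")
p_unseen = laplace_bigram_prob(tokens, vocab, "the", "dog")
print(probs["dog"], p_seen, p_unseen)
```

The unseen word and the unseen pair both end up with a small nonzero probability, and each smoothed distribution still sums to 1 over the vocabulary.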
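Zipf's law
Plotted against rank on log-log axes, word frequency falls along a roughly straight descending line. Unigrams and n-grams both follow this distribution.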
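A straight line on log-log axes means the frequency n_i of the i-th most frequent item behaves like n_i ∝ 1 / i^α, so log n_i = -α log i + c. The sketch below estimates that slope by least squares. The helper name and the synthetic frequencies are hypothetical; on a real corpus the fit would only be approximate.

```python
import math


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(freqs, reverse=True)  # rank 1 = most frequent
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic frequencies that follow n_i = 1000 / i exactly (alpha = 1).
synthetic = [1000.0 / rank for rank in range(1, 51)]
slope = loglog_slope(synthetic)
print(slope)
```

For real data, the same function could be applied to the counts of unigrams, bigrams, or trigrams; all frequencies must be positive for the logarithm to be defined.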
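The constructed dataset looks like this:
for X, Y in seq_data_iter_sequential(my_seq, batch_size=2, num_steps=5):
    print('X: ', X, '\nY:', Y)
# X shape: (n, ns), with n = batch_size and ns = num_steps
# Y shape: (n, ns); Y is X shifted one token to the right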
X: tensor([[ 2,  3,  4,  5,  6],
        [18, 19, 20, 21, 22]])
Y: tensor([[ 3,  4,  5,  6,  7],
        [19, 20, 21, 22, 23]])
X: tensor([[ 7,  8,  9, 10, 11],
        [23, 24, 25, 26, 27]])
Y: tensor([[ 8,  9, 10, 11, 12],
        [24, 25, 26, 27, 28]])
X: tensor([[12, 13, 14, 15, 16],
        [28, 29, 30, 31, 32]])
Y: tensor([[13, 14, 15, 16, 17],
        [29, 30, 31, 32, 33]])
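To make the layout of these batches concrete, here is a minimal sketch of sequential partitioning using plain Python lists instead of tensors. It is not the actual seq_data_iter_sequential; the function name sequential_batches and the explicit offset parameter are hypothetical. The printed output above is consistent with a sequence 0..34 and an offset of 2, and both are assumptions, since the notes do not show how my_seq is defined.

```python
def sequential_batches(corpus, batch_size, num_steps, offset=0):
    """Yield (X, Y) pairs as nested lists of shape (batch_size, num_steps)."""
    # Skip `offset` tokens, reserve one token for the shifted labels,
    # and keep a token count divisible by batch_size.
    num_tokens = ((len(corpus) - offset - 1) // batch_size) * batch_size
    xs = corpus[offset: offset + num_tokens]
    ys = corpus[offset + 1: offset + 1 + num_tokens]  # labels: next token

    # Split into batch_size contiguous rows.
    row_len = num_tokens // batch_size
    x_rows = [xs[r * row_len:(r + 1) * row_len] for r in range(batch_size)]
    y_rows = [ys[r * row_len:(r + 1) * row_len] for r in range(batch_size)]

    # Slide along the rows in windows of num_steps, so row r of one batch
    # continues exactly where row r of the previous batch ended.
    for start in range(0, (row_len // num_steps) * num_steps, num_steps):
        X = [row[start:start + num_steps] for row in x_rows]
        Y = [row[start:start + num_steps] for row in y_rows]
        yield X, Y


# Assumed input: integers 0..34 with an offset of 2.
batches = list(sequential_batches(list(range(35)), batch_size=2, num_steps=5, offset=2))
for X, Y in batches:
    print('X:', X, '\nY:', Y)
```

Because adjacent batches are contiguous along each row, a recurrent model can carry its hidden state from one batch to the next.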