Thursday, June 13, 2019


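1. Character-level neural language model:
An LSTM layer takes strings of N characters from a corpus (Nietzsche's writings) as input and learns to predict the probability distribution of the (N+1)-th character. The result is a character-level neural language model.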

2. Generate text one character at a time

import keras
import numpy as np

path = keras.utils.get_file(
text = open(path).read().lower()
print('Corpus length:', len(text))

maxlen = 60
step = 3

sentences = []

next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])  # input window: characters i through i+maxlen-1
    next_chars.append(text[i + maxlen])  # prediction target: the character at i+maxlen

print('Number of sequences:', len(sentences))
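To make the windowing easier to picture, here is a minimal sketch that applies the same loop to a short toy string. The helper name `make_windows` and the toy values of `maxlen` and `step` are made up for illustration and are not part of the original code.

```python
def make_windows(text, maxlen, step):
    """Hypothetical helper: same slicing loop as above, wrapped in a function.

    Returns (inputs, targets), where each input is a string of maxlen
    characters and each target is the single character that follows it.
    """
    inputs, targets = [], []
    for i in range(0, len(text) - maxlen, step):
        inputs.append(text[i: i + maxlen])
        targets.append(text[i + maxlen])
    return inputs, targets


# Toy corpus and toy sizes (assumptions; the real run uses maxlen=60, step=3).
toy_inputs, toy_targets = make_windows("hello world", maxlen=4, step=3)
for window, target in zip(toy_inputs, toy_targets):
    print(repr(window), '->', repr(target))
```

Each printed pair is one training example: a window of `maxlen` characters and the character the model should learn to predict after it.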

chars = sorted(set(text))  # every distinct character that appears in the corpus
print('Unique characters:', len(chars))
char_indices = {char: i for i, char in enumerate(chars)}  # character -> integer index

x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)  # np.bool was removed in NumPy 1.24; the builtin bool works everywhere
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
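The one-hot encoding above can be checked on a tiny example. The sketch below encodes a four-character toy string and decodes it again. All names starting with `toy_`, along with `one_hot` and `decoded`, are illustrative and do not appear in the original code.

```python
import numpy as np

toy_text = "abca"                      # toy corpus (assumption, not the Nietzsche text)
toy_chars = sorted(set(toy_text))      # distinct characters in sorted order
toy_index = {c: i for i, c in enumerate(toy_chars)}

# One row per character position, one column per distinct character.
one_hot = np.zeros((len(toy_text), len(toy_chars)), dtype=bool)
for t, c in enumerate(toy_text):
    one_hot[t, toy_index[c]] = True

print(one_hot.astype(int))

# Decoding: the position of the single True in each row gives the character back.
decoded = ''.join(toy_chars[i] for i in one_hot.argmax(axis=1))
print(decoded)
```

In the real tensors, `x` has one such matrix per training window (shape `maxlen` by `len(chars)`), and `y` has a single one-hot row per window for the target character.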

from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))  # softmax turns the output into a probability distribution over characters

optimizer = keras.optimizers.RMSprop(learning_rate=0.01)  # Keras releases before 2.3 spell this argument `lr`
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
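    probas = np.random.multinomial(1, preds, 1)  # one trial, like a single roll of a die whose faces have probabilities preds; returns how many times each character came up
    return np.argmax(probas)  # index of the largest entry, i.e. the character that was drawn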
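To see what the temperature does before any sampling happens, the sketch below isolates the reweighting step of `sample` and applies it to a made-up three-character distribution. The function name `reweight` and the values in `toy_preds` are assumptions chosen for illustration.

```python
import numpy as np


def reweight(preds, temperature):
    """Hypothetical helper: the reweighting part of sample(), without the random draw."""
    preds = np.asarray(preds, dtype='float64')
    scaled = np.log(preds) / temperature   # divide log-probabilities by the temperature
    exp_scaled = np.exp(scaled)
    return exp_scaled / exp_scaled.sum()   # renormalize so the result sums to 1


toy_preds = [0.6, 0.3, 0.1]  # made-up softmax output over three characters
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(toy_preds, temperature), 3))
```

A temperature of 1.0 leaves the distribution unchanged. Lower values concentrate the probability on the most likely character, which makes the generated text more repetitive. Higher values flatten the distribution, which makes the text more surprising but also more error-prone.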

import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    model.fit(x, y, batch_size=128, epochs=1)
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')
    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)  # print the current seed text

        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.
            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]
            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)  # print the newly generated character

epoch 1
Epoch 1/1
200278/200278 [==============================] - 472s 2ms/step - loss: 1.9867
--- Generating with seed: "ie et sans esprit!

# Feed the seed text ending in "in these later ages, which may be" to the trained model and generate text at different temperatures

229. in these later ages, which may be "
------ temperature: 0.2
ie et sans esprit!
229. in these later ages, which may be the such the still and the sure and and still the present the sure the man and the presenter the for the still and the sure of the from the still the man be the string the man becount and still the the the the sure the still that the soul and a more of the sure the still the stright the stright and the still the sure the sure and the moral the still the sure the sure and the still the sure and the

------ temperature: 0.5
l the still the sure the sure and the still the sure and the perhaps so this disto the philosopher that the string and will the super; and still the same with the desilse of the shill ana
suld conter the free solition of the than the sure for the sure desilse of presentate of the soulh and the for the histances to litely than this incression, and from the preale of contention, in the precestion, that the present to the discienteness than the is and are thi

------ temperature: 1.0
hat the present to the discienteness than the is and are thing, the musf lattire tercies somolian of remord hister
dolecy, with men ye of suses, is and a
corstare--thas and sole of the consciencly to yees as lose. the denchsion in the fantific fan and stone, and trung with their cincolne, and spoled asted to som suef
agage well in regeraving of real the spirits
mistodicato high returdisming
powhing.--as yre? of sulf the doven weally froe it wat give wo le


