
AI-Driven Text Generation

Developed an LSTM model that generates text mimicking the style of Friedrich Nietzsche's writings.

 

Overview

  • This project uses an LSTM neural network to generate text based on the writings of Friedrich Nietzsche.

  • The model is trained on a dataset of Nietzsche's works, enabling it to produce new text sequences that mimic his style.

 

Dataset

  • Source: The dataset consists of Nietzsche's writings, available online.

  • Data Download: The text is fetched from an online repository and read into memory for processing; a minimal download sketch follows this list.

  • Corpus Statistics: The dataset contains 600,893 characters.
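
A minimal sketch of the download step, assuming the corpus mirror used by the classic Keras character-level LSTM example (the exact repository this project pulls from is not specified):

```python
import tensorflow as tf

# Fetch the Nietzsche corpus; the S3 URL is the standard Keras
# text-datasets mirror (an assumption, not necessarily this project's source).
path = tf.keras.utils.get_file(
    "nietzsche.txt",
    origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt",
)
with open(path, encoding="utf-8") as f:
    text = f.read().lower()  # lowercasing is a common choice for this corpus

print("Corpus length:", len(text))  # 600,893 characters
```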

 

Data Preprocessing

  • Character Indexing: The text is transformed into sequences of characters. Dictionaries mapping characters to indices and vice versa are created.

  • Sequence Creation: The text is split into overlapping sequences of 40 characters with a step size of 3.

  • Vectorization: Each character in the sequences is one-hot encoded, producing a binary matrix representation; a preprocessing sketch follows this list.
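
A sketch of these three steps, assuming `text` is the corpus string from the download step (variable names such as `char_indices` and `maxlen` are illustrative):

```python
import numpy as np

# Character vocabulary and lookup tables in both directions
chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}
indices_char = {i: c for i, c in enumerate(chars)}

# Overlapping 40-character windows with a step of 3; the target is the
# character immediately following each window
maxlen, step = 40, 3
sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])

# One-hot encode: x has shape (samples, maxlen, vocab), y has shape (samples, vocab)
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool_)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool_)
for i, sentence in enumerate(sentences):
    for t, ch in enumerate(sentence):
        x[i, t, char_indices[ch]] = 1
    y[i, char_indices[next_chars[i]]] = 1
```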

 

Model Architecture

  • LSTM Layer: The model uses an LSTM layer with 128 units to capture temporal dependencies in the text.

  • Dense Output Layer: A Dense layer with a softmax activation function outputs probabilities for each character in the vocabulary.

  • Compilation: The model is compiled with the categorical cross-entropy loss function and the Adam optimizer; a Keras sketch follows this list.
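
Under those choices the architecture reduces to a few lines of Keras. A sketch, with the accuracy metric added as an assumption so the curves mentioned under Results can be tracked:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(maxlen, len(chars))),
    layers.LSTM(128),                                # temporal dependencies
    layers.Dense(len(chars), activation="softmax"),  # per-character probabilities
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])  # accuracy metric is an assumption
```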

 

Training

  • Training Setup: The model is trained for 60 epochs with a batch size of 128, using early stopping based on validation loss.

  • Callbacks: A custom callback prints a generated text sample at the end of each epoch, giving a running view of the model's progress; a training sketch follows this list.
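
A sketch of the training setup; `generate_sample` is a hypothetical helper defined in the Text Generation sketch further down, and the validation split and early-stopping patience are assumptions, since the writeup fixes only the epoch count, batch size, and monitored quantity:

```python
from tensorflow import keras

# Print a short generated sample at the end of each epoch
sample_callback = keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: print(generate_sample(model, diversity=0.5))
)
# Stop early when validation loss stops improving (patience is assumed)
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

history = model.fit(
    x, y,
    batch_size=128,
    epochs=60,
    validation_split=0.1,  # assumed; early stopping needs validation data
    callbacks=[sample_callback, early_stopping],
)
```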

 

Text Generation

  • Sampling Function: A helper function samples the next character based on the model's output probabilities, introducing randomness to the text generation process.

  • Diversity: Text is generated at several "diversity" (temperature) settings, which control the randomness of the predictions and range from conservative to more creative outputs; a sampling sketch follows this list.
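
A sketch of the sampling helper and generation loop, following the standard temperature-sampling approach (the helper names are illustrative):

```python
import numpy as np

def sample(preds, diversity=1.0):
    """Draw the next character index from the softmax output,
    rescaled by a temperature ("diversity") parameter."""
    preds = np.asarray(preds, dtype="float64")
    preds = np.log(preds + 1e-8) / diversity  # lower diversity: more conservative
    exp_preds = np.exp(preds)
    probs = exp_preds / np.sum(exp_preds)
    return int(np.argmax(np.random.multinomial(1, probs, 1)))

def generate_sample(model, length=200, diversity=1.0):
    """Generate `length` characters from a random 40-character seed."""
    start = np.random.randint(0, len(text) - maxlen - 1)
    generated = text[start : start + maxlen]
    for _ in range(length):
        # One-hot encode the most recent maxlen characters as model input
        x_pred = np.zeros((1, maxlen, len(chars)))
        for t, ch in enumerate(generated[-maxlen:]):
            x_pred[0, t, char_indices[ch]] = 1
        preds = model.predict(x_pred, verbose=0)[0]
        generated += indices_char[sample(preds, diversity)]
    return generated
```

At low diversity (e.g. 0.2) the model sticks to its most probable characters and the output is repetitive but safe; at 1.0 and above the distribution flattens, giving more varied but less coherent text.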

 

Results

  • Model Performance: Training and validation accuracy and loss are tracked across epochs to monitor learning progress; a plotting sketch follows this list.

  • Generated Text: At the end of training, the model can generate coherent and stylistically consistent text that resembles Nietzsche's writings.
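
A minimal plotting sketch, assuming `history` is the object returned by `model.fit` in the training step and that the model was compiled with an accuracy metric:

```python
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Loss curves per epoch
ax1.plot(history.history["loss"], label="training")
ax1.plot(history.history["val_loss"], label="validation")
ax1.set_xlabel("epoch")
ax1.set_ylabel("categorical cross-entropy")
ax1.legend()

# Accuracy curves per epoch
ax2.plot(history.history["accuracy"], label="training")
ax2.plot(history.history["val_accuracy"], label="validation")
ax2.set_xlabel("epoch")
ax2.set_ylabel("accuracy")
ax2.legend()

plt.tight_layout()
plt.show()
```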

 

Summary

This project demonstrates the application of LSTM networks for sequence generation tasks. By training on a dataset of Nietzsche's works, the model learns to produce text that captures the thematic and stylistic elements of the source material, showcasing the potential of deep learning in natural language processing.
