Generate personalized audio lessons with GPT and Azure AI Speech.
A beginner's course in Japanese can be found here, or as a single .mp3.
A slightly higher-level Japanese course generated with the following input:
Target language: Japanese
Prior knowledge: I've done 100 lessons on duolingo, so I know words like hello, goodbye, some sentences like where is, my name is, some colors, how to say where and there
Target knowledge: Enough to be able to enjoy a three week vacation
can be found here, or as a single .mp3.
Of the text-to-speech solutions I tried, Azure AI Speech was the only one that produced good bilingual results within a single paragraph, by specifying the language with the Speech Synthesis Markup Language (SSML). As a bonus, generating all the lessons during testing and finalization (several hundred lessons, 4-5 hours of conversation) stayed well within the Azure free tier.
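To illustrate how SSML can switch languages inside one paragraph, here is a minimal sketch. It is not taken from this repository: the `Segment` type, the `buildSsml` and `escapeXml` helpers, and the default voice name are assumptions made for the example. It relies only on the standard SSML `<speak>`, `<voice>`, and `<lang xml:lang="...">` elements.

```typescript
// Hypothetical type for this sketch: a run of text tagged with its language.
interface Segment {
  text: string;
  lang: string; // BCP 47 locale, e.g. "en-US" or "ja-JP"
}

// Escape characters that are not allowed verbatim in XML text or attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wrap each segment in a <lang> element so a multilingual voice can
// change pronunciation mid-paragraph. The default voice name is an assumption.
function buildSsml(
  segments: Segment[],
  voice: string = "en-US-JennyMultilingualNeural"
): string {
  const body = segments
    .map((s) => `<lang xml:lang="${escapeXml(s.lang)}">${escapeXml(s.text)}</lang>`)
    .join("");
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">` +
    `<voice name="${escapeXml(voice)}">${body}</voice>` +
    `</speak>`
  );
}

const example = buildSsml([
  { text: "The word for hello is ", lang: "en-US" },
  { text: "こんにちは", lang: "ja-JP" },
]);
console.log(example);
```

The resulting string would then be passed to whichever speech synthesis call the application uses.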
git clone https://github.com/adrianmfi/gpt-tutor.git
cd gpt-tutor
npm install
- Get an API Key from OpenAI.
- If you want to use GPT-4 for the first time, you might have to prepurchase credits to get access.
- Create an Azure speech resource as described here.
- The resource must be in East US, West Europe, or Southeast Asia, as currently only these regions support the Multilingual voice.
Run with:
npx ts-node ./src/create-audio-book.mts
or:
OPENAI_API_KEY=... AZURE_SPEECH_KEY=... AZURE_SPEECH_REGION=... npx ts-node ./src/create-audio-book.mts
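As a rough sketch of how a script could check these three variables before doing any work, consider the following. The `TutorConfig` type and `loadConfig` function are hypothetical and are not taken from `create-audio-book.mts`. The region list mirrors the regions named in the setup steps above, written as Azure region identifiers, and may be out of date.

```typescript
// Hypothetical config shape for this sketch.
interface TutorConfig {
  openAiKey: string;
  speechKey: string;
  speechRegion: string;
}

// Regions named in this README as supporting the Multilingual voice (may change).
const MULTILINGUAL_REGIONS = ["eastus", "westeurope", "southeastasia"];

// Takes the environment as a plain object so it can be tested without Node globals.
function loadConfig(env: Record<string, string | undefined>): TutorConfig {
  const required = ["OPENAI_API_KEY", "AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const speechRegion = (env["AZURE_SPEECH_REGION"] as string).toLowerCase();
  if (!MULTILINGUAL_REGIONS.includes(speechRegion)) {
    throw new Error(
      `Region "${speechRegion}" is not in the list: ${MULTILINGUAL_REGIONS.join(", ")}`
    );
  }
  return {
    openAiKey: env["OPENAI_API_KEY"] as string,
    speechKey: env["AZURE_SPEECH_KEY"] as string,
    speechRegion,
  };
}
```

In a Node script this would be called as `loadConfig(process.env)`.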
Concatenate all generated .mp3 files in a directory into a single file for easier download (tested on macOS):
ffmpeg -f concat -safe 0 -i <(for f in *.mp3; do echo "file '$PWD/$f'"; done | sort -V) -c copy output.mp3
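The command above relies on bash process substitution and `sort -V`, which may not be available in every shell. As a possible alternative, the file list for ffmpeg's concat demuxer can be built in TypeScript. This is a sketch only: `buildConcatList` is a hypothetical helper, not part of this repository, and it assumes the lessons should play in natural numeric filename order.

```typescript
// Build the contents of an ffmpeg concat list file from a set of file names.
// Hypothetical helper for illustration; not part of gpt-tutor.
function buildConcatList(fileNames: string[], dir: string): string {
  // Numeric collation sorts "lesson2" before "lesson10", similar to `sort -V`.
  const collator = new Intl.Collator(undefined, { numeric: true });
  return fileNames
    .filter((name) => name.toLowerCase().endsWith(".mp3"))
    .sort(collator.compare)
    .map((name) => {
      // Inside single quotes, the concat format expects ' to be written as '\''.
      const path = `${dir}/${name}`.replace(/'/g, "'\\''");
      return `file '${path}'`;
    })
    .join("\n");
}

console.log(
  buildConcatList(["lesson10.mp3", "lesson2.mp3", "notes.txt", "lesson1.mp3"], "/tmp/out")
);
```

Writing the result to a file such as `list.txt` and running `ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp3` should then give the same result as the one-liner.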
MIT