Thursday, December 31, 2020

Playing one sentence from an audio file

 

I’m using Lazarus to develop some tools for myself for language immersion. What I want is to hear the same sentence again and again until I fully understand it: I know what it means, I can recognize the individual words (and the form they are in), and I can reproduce it. That takes a lot of repetitions (depending on the language, the sentence, its length, and its level). I usually prefer seven.

Duolingo is a great app - but you have to press that sound button again and again. Besides that, the voices are artificial. In Rosetta Stone, too, you have to press the sound button over and over. On the other hand, Rosetta Stone has the advantage that the voices are human, and that you don’t see the translation (that is usually an advantage, but can in some cases be a drawback too).

Anyway, I have a program which plays sentences again and again. I feed the program with sentences from books. I split the book text into sentences (with a homemade tool) and I add translations.

Yes, translations: sometimes I need them to understand what the reader is saying, and most (literary) books use a vocabulary that exceeds my current level. The books I use are usually older, because I get the recordings from sites like archive.org or LibriVox, and these are books in the public domain.

I’m using deepl.com for translations. Their translations are good enough to understand what’s going on. There can be errors, but I can correct them. DeepL translations are, in my experience, better than translations from Yandex or Google.

So my program reads aloud sentences again and again, but how? Using bass.dll! That’s a cool library which is free (for personal use) and works on Windows and on Linux (and more), and works with Lazarus (and more). Its documentation is good but lacks some examples. For instance, I want to repeat that sentence over and over...

Playing a sentence (as a part of a large mp3) is possible with bass.dll and Lazarus. You can play from a certain position, start a timer, and stop the sound in the timer event. But timers are in general cumbersome and imprecise. It works most of the time but... if your computer is busy with some other task, you sometimes hear the start of the next sentence. And even if that’s just the start (say an “ST”), it’s annoying. On Linux, which I run in VirtualBox on my Windows computer, it’s even worse: most of the time playback runs past the scheduled end. So a timer can’t be used for this purpose.

Bass.dll offers another way to play a part of an mp3, and that is a sync (BASS_ChannelSetSync). That works but... if I also read the sound levels in between (and I want that too), it often crashes. Apparently the levels can’t be read while the sync is looping back, and it is impossible to predict exactly when that happens. I tried to fix that but I couldn’t, so I gave up. Besides this, the code was more complex because of the callbacks involved.
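For reference, a sync-based loop could look roughly like the sketch below. It is not the code from my program: the names LoopSyncProc, LoopStart and PlayLooped are made up for illustration, and it assumes that the unit has bass in its uses clause and that the BASS library has already been initialized.

```pascal
// Fragment for a unit that has "bass" in its uses clause; BASS_Init must
// already have been called. All names below are hypothetical.
var
  LoopStart: QWORD;  // byte position to jump back to

// Called by BASS when the channel reaches the sync position:
// jump back to the start of the sentence.
procedure LoopSyncProc(handle: HSYNC; channel, data: DWORD; user: Pointer);
  {$IFDEF MSWINDOWS}stdcall{$ELSE}cdecl{$ENDIF};
begin
  BASS_ChannelSetPosition(channel, LoopStart, BASS_POS_BYTE);
end;

// Plays FileName from StartSec to EndSec in an endless loop.
// Returns the stream handle, or 0 if the file could not be opened.
function PlayLooped(const FileName: string; StartSec, EndSec: Double): HSTREAM;
var
  loopEnd: QWORD;
begin
  Result := BASS_StreamCreateFile(False, PChar(FileName), 0, 0, 0);
  if Result = 0 then Exit;

  LoopStart := BASS_ChannelSeconds2Bytes(Result, StartSec);
  loopEnd := BASS_ChannelSeconds2Bytes(Result, EndSec);

  // BASS_SYNC_MIXTIME makes the callback run at mixing time instead of
  // when the position is actually heard, so the jump is sample-accurate.
  BASS_ChannelSetSync(Result, BASS_SYNC_POS or BASS_SYNC_MIXTIME, loopEnd,
    @LoopSyncProc, nil);

  BASS_ChannelSetPosition(Result, LoopStart, BASS_POS_BYTE);
  BASS_ChannelPlay(Result, False);
end;
```

The callback runs on a BASS thread, not on the main thread, which is part of what makes this approach more complex than it looks.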

But there are more possibilities. I found two ways which both work. The first way is to write a WAV file, either in memory or on disk, and play that WAV. There is a source code example that converts to WAV (on disk), so that was easy to adapt, and in memory it works the same way. The second way is to use a bass “sample” and play that back. There were no complete examples of that, but by combining some other examples I got it working. This is the method I finally selected, because the code is simple, precise (no timer involved), and reliable (no crashes observed).
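To give an idea of the first way, here is a minimal sketch of building a WAV in memory. It is not the BASS example mentioned above: the helper names (PutTag, PutU16, PutU32, WriteWav) are made up, and it assumes plain PCM data (16-bit stereo at 44100 Hz in this demo, filled with silence instead of decoded audio).

```pascal
program WavInMemorySketch;
{$mode objfpc}{$H+}
uses
  Classes;

// Writes a four-character chunk tag such as 'RIFF'.
procedure PutTag(Dest: TStream; const Tag: AnsiString);
begin
  Dest.WriteBuffer(Tag[1], 4);
end;

// WAV header fields are little-endian; NtoLE converts on big-endian CPUs.
procedure PutU16(Dest: TStream; V: Word);
begin
  V := NtoLE(V);
  Dest.WriteBuffer(V, SizeOf(V));
end;

procedure PutU32(Dest: TStream; V: LongWord);
begin
  V := NtoLE(V);
  Dest.WriteBuffer(V, SizeOf(V));
end;

// Writes a 44-byte PCM WAV header followed by the raw sample data.
procedure WriteWav(Dest: TStream; const Pcm; PcmBytes, Freq: LongWord;
  Chans, Bits: Word);
var
  blockAlign: Word;
begin
  blockAlign := Chans * (Bits div 8);   // bytes per sample frame
  PutTag(Dest, 'RIFF'); PutU32(Dest, 36 + PcmBytes);
  PutTag(Dest, 'WAVE');
  PutTag(Dest, 'fmt '); PutU32(Dest, 16);
  PutU16(Dest, 1);                      // format tag 1 = PCM
  PutU16(Dest, Chans);
  PutU32(Dest, Freq);
  PutU32(Dest, Freq * blockAlign);      // bytes per second
  PutU16(Dest, blockAlign);
  PutU16(Dest, Bits);
  PutTag(Dest, 'data'); PutU32(Dest, PcmBytes);
  Dest.WriteBuffer(Pcm, PcmBytes);
end;

var
  wav: TMemoryStream;
  silence: array of Byte;
begin
  // One second of 16-bit stereo silence stands in for the decoded sentence.
  SetLength(silence, 44100 * 2 * 2);
  wav := TMemoryStream.Create;
  try
    WriteWav(wav, silence[0], Length(silence), 44100, 2, 16);
    WriteLn('WAV size in bytes: ', wav.Size);  // 44 header bytes + data
    // With BASS, the buffer could then be played with something like
    //   BASS_StreamCreateFile(True, wav.Memory, 0, wav.Size, BASS_SAMPLE_LOOP);
    // in which case the stream must stay alive as long as BASS plays from it.
  finally
    wav.Free;
  end;
end.
```

In the real program, the PCM data would come from the decoding channel (read with BASS_ChannelGetData), and the frequency and channel count from BASS_ChannelGetInfo.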

A minimal example of this is presented below, and I hope it is useful for anyone who wants to do a similar thing. The code is Lazarus (Free Pascal) code, but it should be straightforward to port it to another language.

// Requires Classes (TMemoryStream), Math (Min) and bass in the uses clause.
procedure PlayPart;
const
  bufferSize = 10000;
var
  inputChannel: HSTREAM;
  info: BASS_CHANNELINFO;
  outputChannel: HSTREAM;
  sample: HSAMPLE;

  buffer: array [1..bufferSize] of byte;
  b1, b2: QWORD;
  bytesRead: DWORD;
  memo: TMemoryStream;

begin
  // Open the file as a decoding channel: it is only read, not played directly.
  inputChannel := BASS_StreamCreateFile(false, PChar('c:\data\books\Gaidar\DrummersFate\audio\01.mp3'), 0, 0, BASS_STREAM_DECODE);
  if inputChannel = 0 then exit;

  BASS_ChannelGetInfo(inputChannel, info);

  // Specify here the begin and end times (in seconds)
  b1 := BASS_ChannelSeconds2Bytes(inputChannel, 223.0);
  b2 := BASS_ChannelSeconds2Bytes(inputChannel, 225.33);

  BASS_ChannelSetPosition(inputChannel, b1, BASS_POS_BYTE);

  memo := TMemoryStream.Create;
  try
    // Decode the part between b1 and b2 into memory.
    while (BASS_ChannelIsActive(inputChannel) = BASS_ACTIVE_PLAYING) and (b1 < b2) do
    begin
      bytesRead := BASS_ChannelGetData(inputChannel, @buffer, Min(bufferSize, b2 - b1));
      // High(DWORD) is the error value (-1); also stop when nothing was read.
      if (bytesRead = High(DWORD)) or (bytesRead = 0) then break;
      memo.Write(buffer, bytesRead);
      inc(b1, bytesRead);
    end;
    BASS_StreamFree(inputChannel);

    // Create a sample, with the input sound specifications, and max one simultaneous playback.
    // Specify a flag to automatically repeat it.
    // If that is not necessary, a flag of 0 is fine for this example.
    sample := BASS_SampleCreate(memo.Size, info.freq, info.chans, 1, BASS_SAMPLE_LOOP);
    if sample = 0 then exit;

    // Fill it with the read data (BASS copies it, so memo can be freed afterwards)
    BASS_SampleSetData(sample, memo.Memory);
  finally
    memo.Free;
  end;

  // Create a channel (this is required) and play it.
  // In a real program, keep the sample handle somewhere so that it can be
  // released with BASS_SampleFree when the sentence is no longer needed.
  outputChannel := BASS_SampleGetChannel(sample, true);
  BASS_ChannelPlay(outputChannel, true);
end;

This is the whole code (except for initialization of the bass library itself, but that should always be done, once, and is straightforward).
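For completeness, the initialization could look like the sketch below. The device (-1, the default output) and the 44100 Hz rate are assumptions for illustration, and the program name is made up. The type of the window parameter differs per platform in bass.pas (a window handle on Windows, a pointer elsewhere), hence the conditional.

```pascal
program BassInitSketch;
{$mode objfpc}{$H+}
uses
  bass;  // the bass.pas header unit that comes with the BASS download

begin
  // -1 = default output device; 44100 Hz is an assumed output rate.
  if not BASS_Init(-1, 44100, 0, {$IFDEF MSWINDOWS}0{$ELSE}nil{$ENDIF}, nil) then
  begin
    WriteLn('BASS_Init failed, error code ', BASS_ErrorGetCode);
    Halt(1);
  end;
  try
    // Call PlayPart here. Playback runs in the background, so a console
    // program has to wait (for example with ReadLn) before it ends.
  finally
    BASS_Free;  // releases the output device and all remaining handles
  end;
end.
```

In a Lazarus GUI application, the same two calls would typically go in the main form's OnCreate and OnDestroy events.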

It will play this sentence:



As far as I know, it's not possible to save this part as an mp3 (as published here). But you can save it as a WAV file. There are numerous tools to extract parts from a large mp3 and save them. What I needed was to play a part, with software. And for that I use bass and Lazarus.