Encoding PCM (CMSampleBufferRef) to AAC on iOS: how do I set the sample rate and bitrate?

Anonymous · Post by **Anonymous** » 28 Dec 2025, 18:00

I want to encode PCM (CMSampleBufferRef(s) arriving live from AVCaptureAudioDataOutputSampleBufferDelegate) to AAC.

When the first CMSampleBufferRef arrives, I set up both (in/out) AudioStreamBasicDescription(s), the "out" one according to the documentation:

AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));

AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
outAudioStreamBasicDescription.mSampleRate = 44100; // The number of frames per second of the data in the stream, when the stream is played at normal speed. For compressed formats, this field indicates the number of frames per second of equivalent decompressed data. The mSampleRate field must be nonzero, except when this structure is used in a listing of supported formats (see “kAudioStreamAnyRate”).
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // kAudioFormatMPEG4AAC_HE does not work. Can't find `AudioClassDescription`. `mFormatFlags` is set to 0.
outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_SSR; // Format-specific flags to specify details of the format. Set to 0 to indicate no format flags. See “Audio Data Format Identifiers” for the flags that apply to each format.
outAudioStreamBasicDescription.mBytesPerPacket = 0; // The number of bytes in a packet of audio data. To indicate variable packet size, set this field to 0. For a format that uses variable packet size, specify the size of each packet using an AudioStreamPacketDescription structure.
outAudioStreamBasicDescription.mFramesPerPacket = 1024; // The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
outAudioStreamBasicDescription.mBytesPerFrame = 0; // The number of bytes from the start of one frame to the start of the next frame in an audio buffer. Set this field to 0 for compressed formats. ...
outAudioStreamBasicDescription.mChannelsPerFrame = 1; // The number of channels in each frame of audio data. This value must be nonzero.
outAudioStreamBasicDescription.mBitsPerChannel = 0; // ... Set this field to 0 for compressed formats.
outAudioStreamBasicDescription.mReserved = 0; // Pads the structure out to force an even 8-byte alignment.  Must be set to 0.

and the AudioConverterRef:

AudioClassDescription audioClassDescription;
memset(&audioClassDescription, 0, sizeof(audioClassDescription));
UInt32 size;
NSAssert(AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size) == noErr, nil);
uint32_t count = size / sizeof(AudioClassDescription);
AudioClassDescription descriptions[count];
NSAssert(AudioFormatGetProperty(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size, descriptions) == noErr, nil);
for (uint32_t i = 0; i < count; i++) {

if ((outAudioStreamBasicDescription.mFormatID == descriptions[i].mSubType) && (kAppleSoftwareAudioCodecManufacturer == descriptions[i].mManufacturer)) {

memcpy(&audioClassDescription, &descriptions[i], sizeof(audioClassDescription));

}
}
NSAssert(audioClassDescription.mSubType == outAudioStreamBasicDescription.mFormatID && audioClassDescription.mManufacturer == kAppleSoftwareAudioCodecManufacturer, nil);
AudioConverterRef audioConverter;
memset(&audioConverter, 0, sizeof(audioConverter));
NSAssert(AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, &audioClassDescription, &audioConverter) == 0, nil);

Then I convert each CMSampleBufferRef to raw AAC data:

AudioBufferList inAaudioBufferList;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &inAaudioBufferList, sizeof(inAaudioBufferList), NULL, NULL, 0, &blockBuffer);
NSAssert(inAaudioBufferList.mNumberBuffers == 1, nil);

uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize;
uint8_t *buffer = (uint8_t *)malloc(bufferSize);
memset(buffer, 0, bufferSize);
AudioBufferList outAudioBufferList;
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = inAaudioBufferList.mBuffers[0].mNumberChannels;
outAudioBufferList.mBuffers[0].mDataByteSize = bufferSize;
outAudioBufferList.mBuffers[0].mData = buffer;

UInt32 ioOutputDataPacketSize = 1;

NSAssert(AudioConverterFillComplexBuffer(audioConverter, inInputDataProc, &inAaudioBufferList, &ioOutputDataPacketSize, &outAudioBufferList, NULL) == 0, nil);

NSData *data = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];

free(buffer);
CFRelease(blockBuffer);

The inInputDataProc() implementation:

OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
AudioBufferList audioBufferList = *(AudioBufferList *)inUserData;

ioData->mBuffers[0].mData = audioBufferList.mBuffers[0].mData;
ioData->mBuffers[0].mDataByteSize = audioBufferList.mBuffers[0].mDataByteSize;

return  noErr;
}

Now data holds my raw AAC, which I wrap into an ADTS frame with the proper ADTS header. A sequence of these ADTS frames is a playable AAC document.
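The ADTS wrapping code is not shown in the post. For readers following along, here is a minimal sketch in plain C of how a 7-byte ADTS header (no CRC) is commonly built. The function names `adts_sampling_frequency_index` and `adts_write_header` are hypothetical and not taken from the post. The `object_type` argument assumes the MPEG-4 audio object type numbering (1 = Main, 2 = LC, 3 = SSR, 4 = LTP), and the buffer-fullness field is set to the "variable bitrate" value 0x7FF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADTS_HEADER_SIZE 7

/* Map a sample rate in Hz to the 4-bit MPEG-4 sampling frequency index.
   Returns -1 if the rate is not one of the standard table entries. */
static int adts_sampling_frequency_index(uint32_t sample_rate)
{
    static const uint32_t rates[] = {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };
    for (size_t i = 0; i < sizeof(rates) / sizeof(rates[0]); i++) {
        if (rates[i] == sample_rate) {
            return (int)i;
        }
    }
    return -1;
}

/* Write a 7-byte ADTS header for one raw AAC packet of payload_len bytes.
   channel_config: 1 = mono, 2 = stereo (valid range 1..7).
   object_type:    MPEG-4 audio object type, 1..4 (stored as type - 1).
   Returns 0 on success, -1 if an argument cannot be represented. */
static int adts_write_header(uint8_t out[ADTS_HEADER_SIZE],
                             uint32_t sample_rate,
                             unsigned channel_config,
                             unsigned object_type,
                             size_t payload_len)
{
    assert(out != NULL);

    int freq_idx = adts_sampling_frequency_index(sample_rate);
    size_t frame_len = payload_len + ADTS_HEADER_SIZE; /* length field includes the header */

    if (freq_idx < 0 || channel_config < 1 || channel_config > 7 ||
        object_type < 1 || object_type > 4 || frame_len > 0x1FFF) {
        return -1;
    }

    out[0] = 0xFF;                                   /* syncword, high 8 bits */
    out[1] = 0xF1;                                   /* syncword low 4 bits, MPEG-4, layer 0, no CRC */
    out[2] = (uint8_t)(((object_type - 1) << 6) |    /* profile */
                       ((unsigned)freq_idx << 2) |   /* sampling frequency index */
                       (channel_config >> 2));       /* channel config, high bit */
    out[3] = (uint8_t)(((channel_config & 3) << 6) | /* channel config, low 2 bits */
                       (frame_len >> 11));           /* frame length, high 2 bits */
    out[4] = (uint8_t)((frame_len >> 3) & 0xFF);     /* frame length, middle 8 bits */
    out[5] = (uint8_t)(((frame_len & 7) << 5) | 0x1F); /* frame length low 3 bits, fullness high 5 bits */
    out[6] = 0xFC;                                   /* fullness low 6 bits, one raw data block */
    return 0;
}
```

With this sketch, each converted packet would be written as `adts_write_header(header, 44100, 1, object_type, data.length)` followed by the packet bytes. The sample rate passed here has to match the rate of the audio actually contained in the packet, otherwise players will run it at the wrong speed.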

But I don't understand this code as well as I would like to. In fact, I don't understand audio in general. I pieced it together from blogs, forums, and docs over quite a long time, and now it works, but I don't know why, or how to change some of the parameters. So here are my questions:

1. I have to use this converter while the HW encoder is busy (used by AVAssetWriter). That is why I create a SW converter via AudioConverterNewSpecific() rather than AudioConverterNew(). But now setting outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC_HE; does not work: no AudioClassDescription can be found, even with mFormatFlags set to 0. What do I lose by using kAudioFormatMPEG4AAC (kMPEG4Object_AAC_SSR) instead of kAudioFormatMPEG4AAC_HE? What should I use for a live stream: kMPEG4Object_AAC_SSR or kMPEG4Object_AAC_Main?
2. How do I change the sample rate properly? If I set outAudioStreamBasicDescription.mSampleRate to 22050 or 8000, for example, audio playback is slowed down. I set the sampling frequency index in the ADTS header to the same frequency as outAudioStreamBasicDescription.mSampleRate.
3. How do I change the bitrate? ffmpeg -i shows this information for the produced AAC:
```
Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 64 kb/s
```
How can I change it to 16 kbps, for example? The bitrate drops when I lower the sample rate, but I believe that is not the only way? And lowering the sample rate breaks playback, as mentioned in 2.
4. How do I calculate the size of the buffer? Right now I set it to uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize;, because I assume the compressed data will not be larger than the uncompressed data. But isn't that unnecessarily large?
5. How do I set ioOutputDataPacketSize properly? If I read the documentation correctly, I should set it to UInt32 ioOutputDataPacketSize = bufferSize / outAudioStreamBasicDescription.mBytesPerPacket;, but mBytesPerPacket is 0. If I set it to 0, AudioConverterFillComplexBuffer() returns an error. If I set it to 1, it works, but I don't know why.
6. In inInputDataProc() there are 3 "out" parameters. I only set ioData. Should I also set ioNumberDataPackets and outDataPacketDescription? Why and how?
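To put rough numbers on questions 2 to 6, the arithmetic can be sketched in plain C. This is only a back-of-the-envelope aid, not an answer about how AudioConverter behaves. The helper names are hypothetical, the 1024 frames per packet come from the mFramesPerPacket value in the post, and the 16-bit mono input in the usage note is an assumption, since the post does not state the capture format.

```c
#include <assert.h>
#include <stdint.h>

/* Average size in bytes of one AAC packet at a given bitrate.
   One packet always covers frames_per_packet frames of decoded audio,
   so the packet rate is sample_rate / frames_per_packet. */
static double aac_avg_packet_bytes(double bits_per_second,
                                   double sample_rate,
                                   uint32_t frames_per_packet)
{
    assert(sample_rate > 0 && frames_per_packet > 0);
    double packets_per_second = sample_rate / frames_per_packet;
    return (bits_per_second / 8.0) / packets_per_second;
}

/* Number of PCM frames in an input buffer. For linear PCM one packet
   is one frame, so this is also the number of input packets. */
static uint32_t pcm_frames_in_buffer(uint32_t byte_size, uint32_t bytes_per_frame)
{
    assert(bytes_per_frame > 0);
    return byte_size / bytes_per_frame;
}

/* Seconds of playback when `frames` frames are played back at the
   sample rate declared in the stream header. */
static double playback_seconds(uint32_t frames, double declared_rate)
{
    assert(declared_rate > 0);
    return frames / declared_rate;
}
```

For the stream reported by ffmpeg (44100 Hz, 64 kb/s), `aac_avg_packet_bytes(64000, 44100, 1024)` comes out at roughly 186 bytes per packet, and a 16 kb/s target at roughly 46 bytes, which gives a feel for how much smaller than the PCM input an output packet is. Assuming 16-bit mono input, `pcm_frames_in_buffer(2048, 2)` is 1024, exactly one AAC packet's worth of frames. And if samples captured at 44100 Hz were to reach the output without being resampled while the header declares 22050 Hz (an assumption, the post does not confirm this is what happens), `playback_seconds(44100, 22050)` is 2.0, so one second of capture would play for two seconds, which would match the slowed-down playback described in question 2.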
