Only the first track plays in AVMutableComposition()

I have an AVMutableComposition(). I am trying to put MULTIPLE AVCompositionTracks of the same type, AVMediaTypeVideo, into this single composition. This is because I am using 2 different AVMediaTypeVideo sources with different CGSizes and preferredTransforms, taken from the AVAssets they come from.

So the only way to apply their respective preferredTransforms is to provide them in 2 different tracks. But for whatever reason only the first track actually shows any video, almost as if the second track was never there.

So far I have tried:

1) Using AVMutableVideoCompositionLayerInstructions and applying an AVVideoComposition together with an AVAssetExportSession. This works okay; I am still working on the transforms, but it is doable. However, processing the videos takes WELL OVER 1 minute, which is simply unacceptable in my situation.

2) Using multiple tracks without AVAssetExportSession, in which case the 2nd track of the same type never appears. I could put everything on one track, but then all the videos would get the same size and preferredTransform as the first video, which I absolutely do not want, because it stretches them on all sides.

So my question is, is it possible to:

1) Apply instructions to just one track WITHOUT using AVAssetExportSession? // Preferred way BY FAR.

2) Decrease the export time? (I tried PresetPassthrough, but you cannot use it if you set exporter.videoComposition, which is where my instructions live. That is the only place I know of to put instructions; I am not sure whether they can go somewhere else.)

Here is some of my code (without the exporter, since I don't need to export anything anywhere; I just want to do things after the AVMutableComposition combines the items).

Apple states: "Indicates instructions for video composition via an NSArray of instances of classes implementing the AVVideoCompositionInstruction protocol. For the first instruction in the array, timeRange.start must be less than or equal to the earliest time for which playback or other processing will be attempted (note that this will typically be kCMTimeZero). For subsequent instructions, timeRange.start must be equal to the prior instruction's end time. The end time of the last instruction must be greater than or equal to the latest time for which playback or other processing will be attempted (note that this will often be the duration of the asset with which the instance of AVVideoComposition is associated)."

As I understand it, this means the entire composition must be covered by instructions if you decide to use ANY instructions. Why is that? How would I apply instructions to, say, track 2 in this example without changing track 1 or 3 at all:

Track 1 from 0 to 10 sec, Track 2 from 10 to 20 sec, Track 3 from 20 to 30 sec.

Any explanation of this would probably answer my question (if it is doable).
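For the three-segment example above, one way to satisfy the contiguity rule is to give every segment its own instruction and set a transform only where it is needed. This is a minimal sketch, not a tested solution to the problem in this post: `Segment` and `makeInstructions(for:)` are hypothetical names, the code uses current Swift spellings (`CMTimeRange(start:duration:)`) instead of the Swift 3 ones used elsewhere on this page, and it assumes a layer instruction with no transform leaves its track untransformed.

```swift
import AVFoundation
import CoreGraphics

// Hypothetical helper type: one composition track, the time range it
// occupies in the composition, and an optional transform for that range.
struct Segment {
    let track: AVCompositionTrack
    let timeRange: CMTimeRange
    let transform: CGAffineTransform?
}

// Builds one instruction per segment. If the segments' time ranges are
// back to back, the resulting instructions are contiguous as well.
func makeInstructions(for segments: [Segment]) -> [AVMutableVideoCompositionInstruction] {
    return segments.map { segment in
        let layer = AVMutableVideoCompositionLayerInstruction(assetTrack: segment.track)
        // Only segments that carry a transform get one.
        if let transform = segment.transform {
            layer.setTransform(transform, at: segment.timeRange.start)
        }
        let instruction = AVMutableVideoCompositionInstruction()
        instruction.timeRange = segment.timeRange
        instruction.layerInstructions = [layer]
        return instruction
    }
}
```

With segments for 0 to 10 s, 10 to 20 s and 20 to 30 s, only the middle one would carry a transform; the returned array is what gets assigned to the `instructions` of an AVMutableVideoComposition.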

Answers

Answer 1

Ok, so for my exact problem I had to apply specific CGAffineTransforms in Swift to get the result we wanted. The version I am posting here works with any picture taken or picked, as well as with video.

//This method gets the orientation of the current transform. This method is used below to determine the orientation
func orientationFromTransform(_ transform: CGAffineTransform) -> (orientation: UIImageOrientation, isPortrait: Bool) {
    var assetOrientation = UIImageOrientation.up
    var isPortrait = false
    if transform.a == 0 && transform.b == 1.0 && transform.c == -1.0 && transform.d == 0 {
        assetOrientation = .right
        isPortrait = true
    } else if transform.a == 0 && transform.b == -1.0 && transform.c == 1.0 && transform.d == 0 {
        assetOrientation = .left
        isPortrait = true
    } else if transform.a == 1.0 && transform.b == 0 && transform.c == 0 && transform.d == 1.0 {
        assetOrientation = .up
    } else if transform.a == -1.0 && transform.b == 0 && transform.c == 0 && transform.d == -1.0 {
        assetOrientation = .down
    }

    //Returns the orientation as a variable
    return (assetOrientation, isPortrait)
}

//Method that lays out the instructions for each track I am editing and does the transformation on each individual track to get it lined up properly
func videoCompositionInstructionForTrack(_ track: AVCompositionTrack, _ asset: AVAsset) -> AVMutableVideoCompositionLayerInstruction {

    //Returns the layer instruction for the given composition track

    //Create the initial instruction
    let instruction = AVMutableVideoCompositionLayerInstruction(assetTrack: track)

    //This is whatever asset you are about to apply instructions to.
    let assetTrack = asset.tracks(withMediaType: AVMediaTypeVideo)[0]

    //Get the original transform of the asset
    var transform = assetTrack.preferredTransform

    //Get the orientation of the asset and determine whether it is portrait or landscape. Whether you take a picture or pick one from the camera roll, it is ALWAYS reported in one fixed orientation at first (landscape, as far as I remember). This method accounts for that.
    let assetInfo = orientationFromTransform(transform)

    //You need a little background to understand this part. 
    /* MyAsset is my original video. I need to combine a lot of other segments, according to the user, into this original video. So I have to make all the other videos fit this size. 
      This is the width and height ratios from the original video divided by the new asset 
    */
    let width = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width/assetTrack.naturalSize.width
    var height = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height

    //If it is in portrait
    if assetInfo.isPortrait {

        //We actually change the height variable to divide by the width of the old asset instead of the height. This is because of the flip since we determined it is portrait and not landscape. 
        height = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.width

        //We apply the transform and scale the image appropriately.
        transform = transform.scaledBy(x: height, y: height)

        //We also have to move the image or video appropriately. Since we scaled it, it could be way off to the side, outside the visible bounds.
        let movement = ((1/height)*assetTrack.naturalSize.height)-assetTrack.naturalSize.height

        //This lines it up dead center on the left side of the screen perfectly. Now we want to center it.
        transform = transform.translatedBy(x: 0, y: movement)

        //This calculates how much black there is. Cut it in half and there you go!
        let totalBlackDistance = MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width-transform.tx
        transform = transform.translatedBy(x: 0, y: -(totalBlackDistance/2)*(1/height))

    } else {

        //Landscape! We don't need to change the variables, it is all defaulted that way (iOS prefers landscape items), so we scale it appropriately.
        transform = transform.scaledBy(x: width, y: height)

        //This is a little complicated. Because the asset is landscape, it already fits the height correctly (for me anyway); it was just too wide. Think of it as a ratio: scale = ((original height / current asset height) * (current asset width)) / (original width)
        let scale:CGFloat = ((MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height)*(assetTrack.naturalSize.width))/MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width
        transform = transform.scaledBy(x: scale, y: 1)

        //The asset can be way off screen again, so we have to move it back. This time we can center it directly, because a landscape asset is not rotated. Again, this is a formula I derived myself.
        let movement = ((MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.width-((MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height)*(assetTrack.naturalSize.width)))/2)*(1/MyAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize.height/assetTrack.naturalSize.height)
        transform = transform.translatedBy(x: movement, y: 0)
    }

    //This creates the instruction and returns it so we can apply it to each individual track.
    instruction.setTransform(transform, at: kCMTimeZero)
    return instruction
}
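If the hand-derived scale and offset formulas above are hard to follow, the same goal (fit each clip inside the render size without stretching it) can be approached with a generic aspect-fit computation. The sketch below is a different technique from the code in this answer, not a rewrite of it: it takes the bounding box of the track after its preferredTransform, scales it uniformly, and centers it. `aspectFitTransform` and its parameter names are hypothetical.

```swift
import CoreGraphics

// Returns a transform that applies `preferredTransform`, then scales the
// result uniformly to fit inside `renderSize`, then centers it.
func aspectFitTransform(naturalSize: CGSize,
                        preferredTransform: CGAffineTransform,
                        renderSize: CGSize) -> CGAffineTransform {
    // Bounding box of the frame once its orientation has been applied.
    let displayRect = CGRect(origin: .zero, size: naturalSize).applying(preferredTransform)
    guard displayRect.width > 0, displayRect.height > 0 else { return preferredTransform }

    // One scale factor for both axes, so nothing gets stretched.
    let scale = min(renderSize.width / displayRect.width,
                    renderSize.height / displayRect.height)

    // Offsets that center the scaled box inside the render area.
    let offsetX = (renderSize.width - displayRect.width * scale) / 2
    let offsetY = (renderSize.height - displayRect.height * scale) / 2

    // concatenating(_:) applies the receiver first, then the argument.
    return preferredTransform
        .concatenating(CGAffineTransform(translationX: -displayRect.minX, y: -displayRect.minY))
        .concatenating(CGAffineTransform(scaleX: scale, y: scale))
        .concatenating(CGAffineTransform(translationX: offsetX, y: offsetY))
}
```

The result could be passed to `instruction.setTransform(_:at:)` in place of the manually built transform; the clip is then letterboxed or pillarboxed rather than stretched.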

Now that we have these methods, we can apply the right transforms to our assets and get everything fitting cleanly.

func merge() {
    if let firstAsset = MyAsset, let newAsset = newAsset {

        //This creates our overall composition, our new video framework
        let mixComposition = AVMutableComposition()

        //One by one you create tracks (could use loop, but I just had 3 cases)
        let firstTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                                     preferredTrackID: Int32(kCMPersistentTrackID_Invalid))

        //insertTimeRange throws, so it needs a do/catch
        do {

            //Insert a time range into the track. startTime is a time I calculated earlier; put your own time here. The preferredTimescale doesn't have to be 600000, I was experimenting with those numbers; it only controls precision. The at: argument is where the range starts in the composition as a whole, not within the source track. Notice that my at: times differ below. You also pass the source track to read from.
            try firstTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, CMTime(seconds: CMTimeGetSeconds(startTime), preferredTimescale: 600000)),
                                           of: firstAsset.tracks(withMediaType: AVMediaTypeVideo)[0],
                                           at: kCMTimeZero)
        } catch _ {
            print("Failed to load first track")
        }

        //Create the 2nd track
        let secondTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                                      preferredTrackID: Int32(kCMPersistentTrackID_Invalid))

        do {

            //Apply the 2nd timeRange you have. Also apply the correct track you want
            try secondTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, self.endTime-self.startTime),
                                           of: newAsset.tracks(withMediaType: AVMediaTypeVideo)[0],
                                           at: CMTime(seconds: CMTimeGetSeconds(startTime), preferredTimescale: 600000))
            secondTrack.preferredTransform = newAsset.preferredTransform
        } catch _ {
            print("Failed to load second track")
        }

        //We are not sure we are going to use the third track in my case, because they can edit to the end of the original video, causing us not to use a third track. But if we do, it is the same as the others!
        var thirdTrack:AVMutableCompositionTrack!
        if(self.endTime != controller.realDuration) {
            thirdTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeVideo,
                                                                      preferredTrackID: Int32(kCMPersistentTrackID_Invalid))

            //This part appears again at endTime, which is right after the 2nd track is supposed to end.
            do {
                try thirdTrack.insertTimeRange(CMTimeRangeMake(CMTime(seconds: CMTimeGetSeconds(endTime), preferredTimescale: 600000), self.controller.realDuration-endTime),
                                           of: firstAsset.tracks(withMediaType: AVMediaTypeVideo)[0] ,
                                           at: CMTime(seconds: CMTimeGetSeconds(endTime), preferredTimescale: 600000))
            } catch _ {
                print("failed")
            }
        }

        //Same thing with audio!
        if let loadedAudioAsset = controller.audioAsset {
            let audioTrack = mixComposition.addMutableTrack(withMediaType: AVMediaTypeAudio, preferredTrackID: 0)
            do {
                try audioTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, self.controller.realDuration),
                                               of: loadedAudioAsset.tracks(withMediaType: AVMediaTypeAudio)[0] ,
                                               at: kCMTimeZero)
            } catch _ {
                print("Failed to load Audio track")
            }
        }

        //So, now that we have all of these tracks we need to apply those instructions! If we don't, then they could be different sizes. Say my newAsset is 720x1080 and MyAsset is 1440x900 (These are just examples haha), then it would look a tad funky and possibly not show our new asset at all.
        let mainInstruction = AVMutableVideoCompositionInstruction()

        //Make sure the overall time range matches that of the individual tracks, if not, it could cause errors. 
        mainInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, self.controller.realDuration)

        //For each track we made, we need an instruction. Could set loop or do individually as such.
        let firstInstruction = videoCompositionInstructionForTrack(firstTrack, firstAsset)
        //Hide the first layer from startTime on. Layer instructions are composited with the first array element on top, so otherwise the first track could keep covering the second one.
        firstInstruction.setOpacity(0.0, at: startTime)

        //Next instruction (built from newAsset, the asset that was inserted into secondTrack)
        let secondInstruction = videoCompositionInstructionForTrack(secondTrack, newAsset)

        //Again, not sure we need 3rd one, but if we do.
        var thirdInstruction:AVMutableVideoCompositionLayerInstruction!
        if(self.endTime != self.controller.realDuration) {
            secondInstruction.setOpacity(0.0, at: endTime)
            thirdInstruction = videoCompositionInstructionForTrack(thirdTrack, firstAsset)
        }

        //Okay, now that we have all these instructions, we tie them into the main instruction we created above.
        mainInstruction.layerInstructions = [firstInstruction, secondInstruction]
        if(self.endTime != self.controller.realDuration) {
            mainInstruction.layerInstructions += [thirdInstruction]
        }

        //Now we create a video composition, which is a different object from the AVMutableComposition above.
        let mainComposition = AVMutableVideoComposition()

        //We apply the instructions to the video composition
        mainComposition.instructions = [mainInstruction]

        //Duration of one frame (1/30 s here, i.e. 30 fps); change this as necessary
        mainComposition.frameDuration = CMTimeMake(1, 30)

        //This is your render size of the video. 720p, 1080p etc. You set it!
        mainComposition.renderSize = firstAsset.tracks(withMediaType: AVMediaTypeVideo)[0].naturalSize

        //We create an export session (you can't use PresetPassthrough because we are manipulating the transforms of the videos and the quality, so I just set it to highest)
        guard let exporter = AVAssetExportSession(asset: mixComposition, presetName: AVAssetExportPresetHighestQuality) else { return }

        //Provide type of file, provide the url location you want exported to (I don't have mine posted in this example).
        exporter.outputFileType = AVFileTypeMPEG4
        exporter.outputURL = url

        //Then we tell the exporter to export the video according to our video framework, and it does the work!
        exporter.videoComposition = mainComposition

        //Asynchronous methods FTW!
        exporter.exportAsynchronously(completionHandler: {
            //Do whatever when it finishes!
        })
    }
}

There is a lot going on here, but that is what my example needed anyway! Sorry it took me so long to post; let me know if you have questions.

Answer 2

Yes, you can absolutely apply an individual transform to each layer of an AVMutableComposition.

Here is an overview of the process. I have done this personally in Objective-C, so I can't give you the exact Swift code, but I know the same functions work just the same in Swift.

Create an AVMutableComposition.
Create an AVMutableVideoComposition.
Set the render size and frame duration of the video composition.
Now, for each AVAsset:
- Get its video AVAssetTrack and its audio AVAssetTrack.
- Create an AVMutableCompositionTrack for each of those (one for video, one for audio) by adding each to the mutable composition.

This is where it gets more complicated.. (sorry, AVFoundation is not easy!)

Create an AVMutableVideoCompositionLayerInstruction from the AVAssetTrack that refers to each video. For each layer instruction you can set a transform. You can also do things like set a crop rectangle.
Add each layer instruction to an array of layer instructions. When all of them have been created, the array is set on the AVMutableVideoComposition (through the layerInstructions of an AVMutableVideoCompositionInstruction).
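The two steps above might look roughly like this in Swift. This is a sketch under assumptions, not the answerer's code: `makeLayerInstruction` and `makeVideoComposition` are hypothetical helper names, all layer instructions go into a single instruction spanning the whole duration, and current Swift spellings (`.zero`, `CMTimeRange(start:duration:)`) are used.

```swift
import AVFoundation
import CoreGraphics

// One layer instruction per video track, with an optional transform and crop.
func makeLayerInstruction(for track: AVAssetTrack,
                          transform: CGAffineTransform? = nil,
                          cropRectangle: CGRect? = nil) -> AVMutableVideoCompositionLayerInstruction {
    let layer = AVMutableVideoCompositionLayerInstruction(assetTrack: track)
    if let transform = transform {
        layer.setTransform(transform, at: .zero)
    }
    if let cropRectangle = cropRectangle {
        layer.setCropRectangle(cropRectangle, at: .zero)
    }
    return layer
}

// Wraps the layer instructions in one instruction covering `duration`
// and attaches it to a new video composition.
func makeVideoComposition(layerInstructions: [AVVideoCompositionLayerInstruction],
                          duration: CMTime,
                          renderSize: CGSize,
                          framesPerSecond: Int32 = 30) -> AVMutableVideoComposition {
    let instruction = AVMutableVideoCompositionInstruction()
    instruction.timeRange = CMTimeRange(start: .zero, duration: duration)
    // The first element of this array is drawn on top.
    instruction.layerInstructions = layerInstructions

    let videoComposition = AVMutableVideoComposition()
    videoComposition.instructions = [instruction]
    videoComposition.renderSize = renderSize
    videoComposition.frameDuration = CMTime(value: 1, timescale: framesPerSecond)
    return videoComposition
}
```

Because every layer lives in the same instruction here, an upper layer usually has to be hidden with `setOpacity(0.0, at:)` once its clip ends so that the layer beneath it becomes visible.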

And finally..

You end up with an AVPlayerItem that you use to play this back (on an AVPlayer). You create the AVPlayerItem from the AVMutableComposition and then set the AVMutableVideoComposition on the AVPlayerItem itself (setVideoComposition..)
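In Swift this last step is short. A minimal sketch, assuming the composition and video composition were already built as described above; `makePlayerItem` is a hypothetical helper name.

```swift
import AVFoundation

// Creates a player item that plays `composition` and renders it through
// `videoComposition`, with no AVAssetExportSession involved.
func makePlayerItem(composition: AVComposition,
                    videoComposition: AVVideoComposition) -> AVPlayerItem {
    let item = AVPlayerItem(asset: composition)
    // The layer instructions take effect during playback through this property.
    item.videoComposition = videoComposition
    return item
}
```

The returned item can be handed to `AVPlayer(playerItem:)`. Since nothing is re-encoded to a file, this route avoids the export wait described in the question.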

Easy, eh?

It took me a few weeks to get this stuff working well. It is totally unforgiving and, as you mentioned, if you do something wrong it doesn't tell you what you did wrong; it just doesn't appear.

But once you crack it, it works quickly and well.

Finally, everything I have outlined is covered in the AVFoundation docs. It is a lengthy tome, but you need to know it to achieve what you are trying to do.

Good luck!