The WebRTC Video Capture, Encoding, and Sending Process
I. Timestamp Definitions
First, it helps to list how the code defines the various time values, so the code that follows is easier to understand.
1. NTP time
NtpTime RealTimeClock::CurrentNtpTime()          // time elapsed since 1900-01-01 00:00:00
int64_t RealTimeClock::CurrentNtpInMilliseconds() // milliseconds elapsed since 1900-01-01 00:00:00
int64_t rtc::TimeUTCMicros()                      // microseconds elapsed since 1970-01-01 00:00:00
int64_t rtc::TimeUTCMillis()                      // milliseconds elapsed since 1970-01-01 00:00:00
int64_t NtpOffsetMsCalledOnce()                   // offset between NTP time and the local clock, in ms
int64_t NtpOffsetMs()                             // same as NtpOffsetMsCalledOnce()
NtpTime TimeMicrosToNtp(int64_t time_us)          // convert a local timestamp to NTP time
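The key relationship behind these helpers is that the NTP epoch (1900-01-01) precedes the Unix epoch (1970-01-01) by 2,208,988,800 seconds, and that an NTP timestamp can be derived by adding an offset, sampled once, to the monotonic local clock. A minimal standalone sketch of that arithmetic; LocalMillis, WallClockMillis and kNtpToUnixEpochOffsetMs are illustrative names, not WebRTC's:

#include <chrono>
#include <cstdint>
#include <cstdio>

// Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01), in ms.
constexpr int64_t kNtpToUnixEpochOffsetMs = 2208988800LL * 1000;

// Illustrative stand-ins: a monotonic "local" clock and a wall clock.
int64_t LocalMillis() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count();
}
int64_t WallClockMillis() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
}

int main() {
  // Sampled once, like NtpOffsetMsCalledOnce(): difference between the NTP
  // wall clock and the monotonic local clock.
  const int64_t ntp_offset_ms =
      (WallClockMillis() + kNtpToUnixEpochOffsetMs) - LocalMillis();

  // Converting a local timestamp to NTP milliseconds, as TimeMicrosToNtp()
  // conceptually does (modulo the fixed-point NtpTime representation).
  const int64_t local_ms = LocalMillis();
  const int64_t ntp_ms = local_ms + ntp_offset_ms;
  std::printf("local=%lld ms  ntp=%lld ms\n", (long long)local_ms, (long long)ntp_ms);
}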
2. Local (monotonic) time
Counted from the moment the system boots; it is not affected by the user changing the system time.
int64_t rtc::TimeMillis()                   // milliseconds
int64_t rtc::TimeMicros()                   // microseconds
int64_t rtc::TimeNanos()                    // nanoseconds
int64_t RealTimeClock::TimeInMilliseconds() // milliseconds
int64_t RealTimeClock::TimeInMicroseconds() // microseconds
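On Linux these helpers ultimately read a monotonic clock, which is why they keep counting across user changes to the system date. A minimal sketch of reading such a clock directly, as an illustration of the idea rather than WebRTC's actual implementation:

#include <cstdint>
#include <ctime>

// Monotonic nanoseconds since an arbitrary fixed point (typically boot).
// Unaffected by NTP adjustments or the user changing the wall-clock time.
int64_t MonotonicNanos() {
  timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return static_cast<int64_t>(ts.tv_sec) * 1000000000 + ts.tv_nsec;
}

int64_t MonotonicMicros() { return MonotonicNanos() / 1000; }
int64_t MonotonicMillis() { return MonotonicNanos() / 1000000; }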
II. Camera Capture, Timestamp Setting, and Data Delivery
VideoCaptureImpl is the video-capture implementation class. Each platform implements its own subclass containing the platform-specific details, and every frame captured by a subclass is passed in through VideoCaptureImpl::IncomingFrame. For example, the Android subclass is VideoCaptureAndroid and the Linux subclass is VideoCaptureModuleV4L2.
Taking the Linux platform as an example:
After VideoCaptureModuleV4L2 captures data, it hands the frame back through the following interface:
int32_t VideoCaptureImpl::IncomingFrame(
uint8_t* videoFrame,
int32_t videoFrameLength,
const VideoCaptureCapability& frameInfo,
int64_t captureTime/*=0*/) // must be specified in the NTP time format in milliseconds.
If captureTime is non-zero in the interface above, it must be an NTP timestamp. VideoCaptureModuleV4L2 does not pass this argument, so the default value 0 is used.
The frame's timestamp (timestamp_us_) is set here, and the frame is delivered to the receiver via callback:
VideoCaptureImpl::IncomingFrame(captureTime = 0)  // with VideoCaptureModuleV4L2, captureTime takes the default value 0
{
  captureFrame.set_timestamp_ms(rtc::TimeMillis());  // set this frame's timestamp to local time; timestamp_rtp_ and ntp_time_ms_ are still unset (0)
  {
    VideoCaptureImpl::DeliverCapturedFrame(captureFrame)  // deliver the captured frame
    {
      _dataCallBack->OnFrame(captureFrame);  // callback with the frame
    }
  }
}
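A frame effectively carries three timestamp fields, and the capture path above fills only the first one. The following simplified model shows how they relate; the field and method names mirror webrtc::VideoFrame, but the struct itself is only an illustration:

#include <cstdint>

// Simplified stand-in for the three timestamp fields on webrtc::VideoFrame.
struct FrameTimestamps {
  int64_t timestamp_us_ = 0;   // local (monotonic) capture time, microseconds
  int64_t ntp_time_ms_ = 0;    // capture time on the NTP clock, milliseconds
  uint32_t timestamp_rtp_ = 0; // RTP timestamp, 90 kHz units for video

  // set_timestamp_ms()/render_time_ms() are just ms<->us views of timestamp_us_.
  void set_timestamp_ms(int64_t ms) { timestamp_us_ = ms * 1000; }
  int64_t render_time_ms() const { return timestamp_us_ / 1000; }
};

// After VideoCaptureImpl::IncomingFrame only timestamp_us_ is set;
// ntp_time_ms_ and timestamp_rtp_ remain 0 until VideoStreamEncoder::OnFrame.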
1. Delivery to the encoder
void VideoStreamEncoder::OnFrame(const VideoFrame& video_frame)
Set the frame's NTP time (ntp_time_ms_):
// Capture time may come from clock with an offset and drift from clock_.
int64_t capture_ntp_time_ms;
if (video_frame.ntp_time_ms() > 0) {  // still 0 at this point, so this branch is not taken
capture_ntp_time_ms = video_frame.ntp_time_ms();
} else if (video_frame.render_time_ms() != 0) {  // render_time_ms() is derived from timestamp_us_ (local time), already set during capture
capture_ntp_time_ms = video_frame.render_time_ms() + delta_ntp_internal_ms_;
} else {
capture_ntp_time_ms = current_time_ms + delta_ntp_internal_ms_;
}
incoming_frame.set_ntp_time_ms(capture_ntp_time_ms);
delta_ntp_internal_ms_ is initialized in the class constructor as the difference between NTP time and local time:
delta_ntp_internal_ms_(clock_->CurrentNtpInMilliseconds() - clock_->TimeInMilliseconds())
Set the frame's RTP timestamp (timestamp_rtp_):
// Convert NTP time, in ms, to RTP timestamp.
const int kMsToRtpTimestamp = 90;
incoming_frame.set_timestamp(
kMsToRtpTimestamp * static_cast<uint32_t>(incoming_frame.ntp_time_ms()));
At this point the frame's render timestamp (timestamp_us_), capture NTP time (ntp_time_ms_), and RTP timestamp (timestamp_rtp_) all have values. They all represent the same capture instant, just in different forms.
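A worked example with made-up numbers ties the three values together. Assume the monotonic clock read 120,000 ms when the frame was captured and delta_ntp_internal_ms_ is 3,900,000,000,000 ms:

#include <cstdint>
#include <cstdio>

int main() {
  // Hypothetical values, for illustration only.
  const int64_t render_time_ms = 120000;                 // local capture time
  const int64_t delta_ntp_internal_ms = 3900000000000LL; // NTP - local, sampled at startup

  // VideoStreamEncoder::OnFrame: derive the capture NTP time.
  const int64_t capture_ntp_time_ms = render_time_ms + delta_ntp_internal_ms;

  // NTP milliseconds -> RTP timestamp (90 kHz clock), truncated to 32 bits.
  const int kMsToRtpTimestamp = 90;
  const uint32_t rtp_timestamp =
      kMsToRtpTimestamp * static_cast<uint32_t>(capture_ntp_time_ms);

  std::printf("ntp=%lld ms rtp=%u\n", (long long)capture_ntp_time_ms, rtp_timestamp);
}

The cast to uint32_t means the RTP timestamp wraps modulo 2^32, which is expected: RTP timestamps are modular, and receivers only ever compare differences between them.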
Ignoring the case where frames are dropped because the encoder is congested, the video frame is passed to MaybeEncodeVideoFrame(video_frame) for encoding.
void VideoStreamEncoder::MaybeEncodeVideoFrame(const VideoFrame& video_frame,
int64_t time_when_posted_us) {
// skip other code
EncodeVideoFrame(video_frame, time_when_posted_us);
}
void VideoStreamEncoder::EncodeVideoFrame(const VideoFrame& video_frame,
int64_t time_when_posted_us) {
// skip other code
VideoFrame out_frame(video_frame);
encoder_->Encode(out_frame, &next_frame_types_);
}
Tracing the code, encoder_ is created by InternalEncoderFactory:
std::unique_ptr<VideoEncoder> InternalEncoderFactory::CreateVideoEncoder(
const SdpVideoFormat& format) {
if (absl::EqualsIgnoreCase(format.name, cricket::kVp8CodecName))
return VP8Encoder::Create();
if (absl::EqualsIgnoreCase(format.name, cricket::kVp9CodecName))
return VP9Encoder::Create(cricket::VideoCodec(format));
if (absl::EqualsIgnoreCase(format.name, cricket::kH264CodecName))
return H264Encoder::Create(cricket::VideoCodec(format));
if (kIsLibaomAv1EncoderSupported &&
absl::EqualsIgnoreCase(format.name, cricket::kAv1CodecName))
return CreateLibaomAv1Encoder();
RTC_LOG(LS_ERROR) << "Trying to created encoder of unsupported format "
<< format.name;
return nullptr;
}
std::unique_ptr<H264Encoder> H264Encoder::Create(
const cricket::VideoCodec& codec) {
RTC_DCHECK(H264Encoder::IsSupported());
#if defined(WEBRTC_USE_H264)
RTC_CHECK(g_rtc_use_h264);
RTC_LOG(LS_INFO) << "Creating H264EncoderImpl.";
return std::make_unique<H264EncoderImpl>(codec);
#else
RTC_NOTREACHED();
return nullptr;
#endif
}
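Putting the two pieces together, a hedged usage sketch of obtaining an H.264 encoder from the built-in factory; the include paths are approximate for a WebRTC tree of this era:

#include <memory>

#include "api/video_codecs/sdp_video_format.h"
#include "api/video_codecs/video_encoder.h"
#include "media/base/media_constants.h"
#include "media/engine/internal_encoder_factory.h"

// Ask the built-in factory for an H.264 encoder by its SDP format name.
// Returns nullptr if the build was not compiled with H.264 support.
std::unique_ptr<webrtc::VideoEncoder> CreateH264FromInternalFactory() {
  webrtc::InternalEncoderFactory factory;
  return factory.CreateVideoEncoder(
      webrtc::SdpVideoFormat(cricket::kH264CodecName));
}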
Taking H.264 as an example, encoding happens in the following function:
int32_t H264EncoderImpl::Encode(
const VideoFrame& input_frame,
const std::vector<VideoFrameType>* frame_types) {
rtc::scoped_refptr<const I420BufferInterface> frame_buffer =
input_frame.video_frame_buffer()->ToI420();
// Encode image for each layer.
for (size_t i = 0; i < encoders_.size(); ++i) {
// EncodeFrame input.
pictures_[i] = {0};
pictures_[i].iPicWidth = configurations_[i].width;
pictures_[i].iPicHeight = configurations_[i].height;
pictures_[i].iColorFormat = EVideoFormatType::videoFormatI420;
pictures_[i].uiTimeStamp = input_frame.ntp_time_ms();  // the encoder timestamp uses the NTP time
// Downscale images on second and ongoing layers.
if (i == 0) {
pictures_[i].iStride[0] = frame_buffer->StrideY();
pictures_[i].iStride[1] = frame_buffer->StrideU();
pictures_[i].iStride[2] = frame_buffer->StrideV();
pictures_[i].pData[0] = const_cast<uint8_t*>(frame_buffer->DataY());
pictures_[i].pData[1] = const_cast<uint8_t*>(frame_buffer->DataU());
pictures_[i].pData[2] = const_cast<uint8_t*>(frame_buffer->DataV());
} else {
// skip the code
}
// Encode!
encoders_[i]->EncodeFrame(&pictures_[i], &info);
encoded_images_[i]._encodedWidth = configurations_[i].width;
encoded_images_[i]._encodedHeight = configurations_[i].height;
encoded_images_[i].SetTimestamp(input_frame.timestamp());  // set the RTP timestamp (timestamp_rtp_); capture_time_ms_ is never set and stays 0
encoded_images_[i]._frameType = ConvertToVideoFrameType(info.eFrameType);
encoded_images_[i].SetSpatialIndex(configurations_[i].simulcast_idx);
// Split encoded image up into fragments. This also updates
// |encoded_image_|.
// After encoding, the encoded data is in |info|. RtpFragmentize copies it into encoded_images_[i] and records the NALU layout in frag_header.
RTPFragmentationHeader frag_header;
RtpFragmentize(&encoded_images_[i], &info, &frag_header);
// On success, the encoded data is delivered through the callback; the receiver is VideoStreamEncoder.
encoded_image_callback_->OnEncodedImage(encoded_images_[i],
&codec_specific, &frag_header);
}
}
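RtpFragmentize essentially walks the Annex-B bitstream produced by the OpenH264 encoder, copies it into encoded_images_[i], and records the offset and length of every NALU in frag_header. A simplified, standalone sketch of that NALU scan (start-code search only, not the actual WebRTC function):

#include <cstddef>
#include <cstdint>
#include <vector>

struct NaluIndex {
  size_t payload_start;  // offset of the NALU payload (after the start code)
  size_t payload_size;   // bytes up to the next start code (or end of buffer)
};

// Find Annex-B start codes (00 00 01 or 00 00 00 01) and index each NALU.
std::vector<NaluIndex> FindNalus(const uint8_t* data, size_t size) {
  std::vector<NaluIndex> nalus;
  for (size_t i = 0; i + 3 <= size; ++i) {
    if (data[i] == 0 && data[i + 1] == 0 &&
        (data[i + 2] == 1 ||
         (i + 4 <= size && data[i + 2] == 0 && data[i + 3] == 1))) {
      size_t start = i + (data[i + 2] == 1 ? 3 : 4);
      if (!nalus.empty())
        nalus.back().payload_size = i - nalus.back().payload_start;
      nalus.push_back({start, 0});
      i = start - 1;  // continue scanning from inside the new payload
    }
  }
  if (!nalus.empty())
    nalus.back().payload_size = size - nalus.back().payload_start;
  return nalus;
}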
The callback returns to VideoStreamEncoder:
EncodedImageCallback::Result VideoStreamEncoder::OnEncodedImage(
const EncodedImage& encoded_image,
const CodecSpecificInfo* codec_specific_info,
const RTPFragmentationHeader* fragmentation) {
EncodedImageCallback::Result result = sink_->OnEncodedImage(
image_copy, codec_info_copy ? codec_info_copy.get() : codec_specific_info,
fragmentation_copy ? fragmentation_copy.get() : fragmentation);
}
It is then forwarded to VideoSendStreamImpl, where rtp_video_sender_ packetizes and sends the data:
EncodedImageCallback::Result VideoSendStreamImpl::OnEncodedImage(
const EncodedImage& encoded_image,
const CodecSpecificInfo* codec_specific_info,
const RTPFragmentationHeader* fragmentation) {
EncodedImageCallback::Result result(EncodedImageCallback::Result::OK);
result = rtp_video_sender_->OnEncodedImage(encoded_image, codec_specific_info,
fragmentation);
}
RtpVideoSender handles sending the RTP packets and the RTCP SR packets:
EncodedImageCallback::Result RtpVideoSender::OnEncodedImage(
const EncodedImage& encoded_image,
const CodecSpecificInfo* codec_specific_info,
const RTPFragmentationHeader* fragmentation) {
// Compute the RTP timestamp: add the StartTimestamp offset, which defaults to a random value.
uint32_t rtp_timestamp =
encoded_image.Timestamp() +
rtp_streams_[stream_index].rtp_rtcp->StartTimestamp();
// RTCPSender has it's own copy of the timestamp offset, added in
// RTCPSender::BuildSR, hence we must not add the in the offset for this call.
// TODO(nisse): Delete RTCPSender:timestamp_offset_, and see if we can confine
// knowledge of the offset to a single place.
if (!rtp_streams_[stream_index].rtp_rtcp->OnSendingRtpFrame(
encoded_image.Timestamp(), encoded_image.capture_time_ms_,  // no assignment to capture_time_ms_ was found, so it stays 0
rtp_config_.payload_type,
encoded_image._frameType == VideoFrameType::kVideoFrameKey)) {
// The payload router could be active but this module isn't sending.
return Result(Result::ERROR_SEND_FAILED);
}
bool send_result = rtp_streams_[stream_index].sender_video->SendEncodedImage(
rtp_config_.payload_type, codec_type_, rtp_timestamp, encoded_image,
fragmentation,
params_[stream_index].GetRtpVideoHeader(
encoded_image, codec_specific_info, shared_frame_id_),
expected_retransmission_time_ms);
}
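Because the RTP timestamp is an unsigned 32-bit value, adding the random StartTimestamp simply wraps modulo 2^32. That is harmless: receivers only ever look at timestamp differences. A short worked example with made-up values:

#include <cstdint>
#include <cstdio>

int main() {
  const uint32_t frame_timestamp = 4000000000u;  // encoded_image.Timestamp()
  const uint32_t start_timestamp = 500000000u;   // random per-stream offset

  // Unsigned 32-bit addition wraps around: 4,500,000,000 mod 2^32.
  const uint32_t rtp_timestamp = frame_timestamp + start_timestamp;
  std::printf("rtp_timestamp=%u\n", rtp_timestamp);  // 205032704

  // Differences between consecutive timestamps are preserved across the wrap,
  // which is all an RTP receiver needs.
  const uint32_t next_frame = frame_timestamp + 3000;  // +3000 = 1/30 s at 90 kHz
  std::printf("delta=%u\n", (next_frame + start_timestamp) - rtp_timestamp);  // 3000
}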
bool RTPSenderVideo::SendEncodedImage(
int payload_type,
absl::optional<VideoCodecType> codec_type,
uint32_t rtp_timestamp,
const EncodedImage& encoded_image,
const RTPFragmentationHeader* fragmentation,
RTPVideoHeader video_header,
absl::optional<int64_t> expected_retransmission_time_ms) {
return SendVideo(payload_type, codec_type, rtp_timestamp,
encoded_image.capture_time_ms_, encoded_image, fragmentation,
video_header, expected_retransmission_time_ms);
}
bool RTPSenderVideo::SendVideo(
int payload_type,
absl::optional<VideoCodecType> codec_type,
uint32_t rtp_timestamp,
int64_t capture_time_ms,
rtc::ArrayView<const uint8_t> payload,
const RTPFragmentationHeader* fragmentation,
RTPVideoHeader video_header,
absl::optional<int64_t> expected_retransmission_time_ms) {
std::unique_ptr<RtpPacketToSend> single_packet =
rtp_sender_->AllocatePacket();
RTC_DCHECK_LE(packet_capacity, single_packet->capacity());
single_packet->SetPayloadType(payload_type);  // set the payload type
single_packet->SetTimestamp(rtp_timestamp);   // set the RTP timestamp
single_packet->set_capture_time_ms(capture_time_ms);
// skip other code
bool first_frame = first_frame_sent_();
std::vector<std::unique_ptr<RtpPacketToSend>> rtp_packets;
for (size_t i = 0; i < num_packets; ++i) {
RtpPacketToSend* packet;
int expected_payload_capacity;
// Choose right packet template:
if (num_packets == 1) {
packet = std::move(single_packet);
expected_payload_capacity =
limits.max_payload_len - limits.single_packet_reduction_len;
} else if (i == 0) {
packet = std::move(first_packet);
expected_payload_capacity =
limits.max_payload_len - limits.first_packet_reduction_len;
} else if (i == num_packets - 1) {
packet = std::move(last_packet);
expected_payload_capacity =
limits.max_payload_len - limits.last_packet_reduction_len;
} else {
packet = std::make_unique<RtpPacketToSend>(*middle_packet);
expected_payload_capacity = limits.max_payload_len;
}
packet->set_first_packet_of_frame(i == 0);
if (!packetizer->NextPacket(packet.get()))  // e.g. RtpPacketizerH264: take the next chunk and fill the packet's payload
return false;
RTC_DCHECK_LE(packet->payload_size(), expected_payload_capacity);
if (!rtp_sender_->AssignSequenceNumber(packet.get()))  // assign the sequence number
return false;
// No FEC protection for upper temporal layers, if used.
bool protect_packet = temporal_id == 0 || temporal_id == kNoTemporalIdx;
packet->set_allow_retransmission(allow_retransmission);
// Put packetization finish timestamp into extension.
if (packet->HasExtension<VideoTimingExtension>()) {
packet->set_packetization_finish_time_ms(clock_->TimeInMilliseconds());
}
// FEC logic
if (protect_packet && fec_generator_) {
if (red_enabled() &&
exclude_transport_sequence_number_from_fec_experiment_) {
// See comments at the top of the file why experiment
// "WebRTC-kExcludeTransportSequenceNumberFromFec" is needed in
// conjunction with datagram transport.
// TODO(sukhanov): We may also need to implement it for flexfec_sender
// if we decide to keep this approach in the future.
uint16_t transport_senquence_number;
if (packet->GetExtension<webrtc::TransportSequenceNumber>(
&transport_senquence_number)) {
if (!packet->RemoveExtension(webrtc::TransportSequenceNumber::kId)) {
RTC_NOTREACHED()
<< "Failed to remove transport sequence number, packet="
<< packet->ToString();
}
}
}
fec_generator_->AddPacketAndGenerateFec(*packet);
}
if (red_enabled()) {
// Queue the RED (redundancy-encoded) packet instead of the original.
std::unique_ptr<RtpPacketToSend> red_packet(new RtpPacketToSend(*packet));
BuildRedPayload(*packet, red_packet.get());
red_packet->SetPayloadType(*red_payload_type_);
// Send |red_packet| instead of |packet| for allocated sequence number.
red_packet->set_packet_type(RtpPacketMediaType::kVideo);
red_packet->set_allow_retransmission(packet->allow_retransmission());
rtp_packets.emplace_back(std::move(red_packet));
} else {
// Queue the original packet.
packet->set_packet_type(RtpPacketMediaType::kVideo);
rtp_packets.emplace_back(std::move(packet));
}
if (first_frame) {
if (i == 0) {
RTC_LOG(LS_INFO)
<< "Sent first RTP packet of the first video frame (pre-pacer)";
}
if (i == num_packets - 1) {
RTC_LOG(LS_INFO)
<< "Sent last RTP packet of the first video frame (pre-pacer)";
}
}
}
if (fec_generator_) {
// Fetch any FEC packets generated from the media frame and add them to
// the list of packets to send.
auto fec_packets = fec_generator_->GetFecPackets();
// TODO(bugs.webrtc.org/11340): Move sequence number assignment into
// UlpfecGenerator.
const bool generate_sequence_numbers = !fec_generator_->FecSsrc();
for (auto& fec_packet : fec_packets) {
if (generate_sequence_numbers) {
rtp_sender_->AssignSequenceNumber(fec_packet.get());
}
rtp_packets.emplace_back(std::move(fec_packet));
}
}
// Send the RTP packets.
LogAndSendToNetwork(std::move(rtp_packets), unpacketized_payload_size);
}
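The packet-template selection above is plain capacity arithmetic: the first and last packets of a frame may need to reserve extra bytes, so their usable payload is the maximum payload length minus a per-position reduction. A standalone sketch of that bookkeeping; the struct only mimics the spirit of the real PayloadSizeLimits, and the numbers in main are made up:

#include <cstdio>

struct PayloadSizeLimits {
  int max_payload_len = 1200;
  int first_packet_reduction_len = 0;
  int last_packet_reduction_len = 0;
  int single_packet_reduction_len = 0;
};

// Usable payload bytes for packet |i| out of |num_packets| in one frame,
// following the same single/first/last/middle cases as SendVideo().
int ExpectedPayloadCapacity(const PayloadSizeLimits& limits, int i, int num_packets) {
  if (num_packets == 1)
    return limits.max_payload_len - limits.single_packet_reduction_len;
  if (i == 0)
    return limits.max_payload_len - limits.first_packet_reduction_len;
  if (i == num_packets - 1)
    return limits.max_payload_len - limits.last_packet_reduction_len;
  return limits.max_payload_len;
}

int main() {
  PayloadSizeLimits limits;
  limits.first_packet_reduction_len = 100;  // e.g. room reserved on packet 0
  for (int i = 0; i < 3; ++i)
    std::printf("packet %d capacity = %d\n", i, ExpectedPayloadCapacity(limits, i, 3));
}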
Finally, the RTP packet is sent out:
void RtpSenderEgress::SendPacket(RtpPacketToSend* packet,
const PacedPacketInfo& pacing_info) {
const bool send_success = SendPacketToNetwork(*packet, options, pacing_info);
}
bool RtpSenderEgress::SendPacketToNetwork(const RtpPacketToSend& packet,
const PacketOptions& options,
const PacedPacketInfo& pacing_info) {
int bytes_sent = -1;
if (transport_) {
UpdateRtpOverhead(packet);
bytes_sent = transport_->SendRtp(packet.data(), packet.size(), options)
? static_cast<int>(packet.size())
: -1;
if (event_log_ && bytes_sent > 0) {
event_log_->Log(std::make_unique<RtcEventRtpPacketOutgoing>(
packet, pacing_info.probe_cluster_id));
}
}
}
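transport_->SendRtp() is where the stack hands the serialized packet to the application layer. In a normal PeerConnection this Transport is provided internally, but when driving the RTP modules directly you implement it yourself. A hedged sketch, assuming the webrtc::Transport interface from api/call/transport.h; the UDP forwarding part is a placeholder:

#include <cstddef>
#include <cstdint>

#include "api/call/transport.h"

// Minimal Transport that would forward serialized RTP/RTCP packets to a
// user-owned socket. SendUdp() is a placeholder for the application's I/O.
class UdpForwardingTransport : public webrtc::Transport {
 public:
  bool SendRtp(const uint8_t* packet, size_t length,
               const webrtc::PacketOptions& options) override {
    return SendUdp(packet, length);
  }
  bool SendRtcp(const uint8_t* packet, size_t length) override {
    return SendUdp(packet, length);
  }

 private:
  bool SendUdp(const uint8_t* /*data*/, size_t /*size*/) {
    // Write to a UDP socket here; this sketch just reports success.
    return true;
  }
};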
The RTCP SR packet is built as follows:
std::unique_ptr<rtcp::RtcpPacket> RTCPSender::BuildSR(const RtcpContext& ctx) {
// Timestamp shouldn't be estimated before first media frame.
RTC_DCHECK_GE(last_frame_capture_time_ms_, 0);
// The timestamp of this RTCP packet should be estimated as the timestamp of
// the frame being captured at this moment. We are calculating that
// timestamp as the last frame's timestamp + the time since the last frame
// was captured.
int rtp_rate = rtp_clock_rates_khz_[last_payload_type_];
if (rtp_rate <= 0) {
rtp_rate =
(audio_ ? kBogusRtpRateForAudioRtcp : kVideoPayloadTypeFrequency) /
1000;
}
// Round now_us_ to the closest millisecond, because Ntp time is rounded
// when converted to milliseconds,
uint32_t rtp_timestamp =
timestamp_offset_ + last_rtp_timestamp_ +
((ctx.now_us_ + 500) / 1000 - last_frame_capture_time_ms_) * rtp_rate;
rtcp::SenderReport* report = new rtcp::SenderReport();
report->SetSenderSsrc(ssrc_);
report->SetNtp(TimeMicrosToNtp(ctx.now_us_));  // NTP time converted from the current local time: the NTP time at which this SR is generated
report->SetRtpTimestamp(rtp_timestamp);  // last frame's RTP timestamp + offset + time elapsed since that frame was captured
report->SetPacketCount(ctx.feedback_state_.packets_sent);
report->SetOctetCount(ctx.feedback_state_.media_bytes_sent);
report->SetReportBlocks(CreateReportBlocks(ctx.feedback_state_));
return std::unique_ptr<rtcp::RtcpPacket>(report);
}
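A worked example of that extrapolation with made-up numbers: the last video frame was captured at local time 20,000 ms and sent with last_rtp_timestamp_ = 90,000, the random offset is 1,000,000, and the SR is built 40 ms later:

#include <cstdint>
#include <cstdio>

int main() {
  // Hypothetical values, mirroring the variables in RTCPSender::BuildSR.
  const uint32_t timestamp_offset = 1000000;        // random per-stream offset
  const uint32_t last_rtp_timestamp = 90000;        // RTP ts of last sent frame
  const int64_t last_frame_capture_time_ms = 20000; // local capture time of that frame
  const int64_t now_us = 20040 * 1000;              // SR built 40 ms later
  const int rtp_rate = 90;                          // 90 kHz video clock, in kHz

  // Extrapolate: "what would the RTP timestamp of a frame captured right now be?"
  const uint32_t rtp_timestamp =
      timestamp_offset + last_rtp_timestamp +
      static_cast<uint32_t>(((now_us + 500) / 1000 - last_frame_capture_time_ms) *
                            rtp_rate);

  std::printf("SR rtp_timestamp = %u\n", rtp_timestamp);  // 1000000 + 90000 + 40*90 = 1093600
}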
The code below is from an older version of WebRTC (not the current tree).
--> Set the frame's render_time_ms
1. Since the captureTime passed in is 0, it is set to the local timestamp:
captureFrame.set_render_time_ms(TickTime::MillisecondTimestamp())
2. Otherwise, it is set to the NTP value minus (the difference between NTP time and the local timestamp).
The constructor initializes that difference between NTP time and the local timestamp:
delta_ntp_internal_ms_(
Clock::GetRealTimeClock()->CurrentNtpInMilliseconds() - TickTime::MillisecondTimestamp())
captureFrame.set_render_time_ms(capture_time - delta_ntp_internal_ms_);
It is easy to see that the computed value is really just the local timestamp: capture_time - delta_ntp_internal_ms_ = capture_time - (NTP_now - local_now), and when capture_time is an NTP timestamp taken at roughly the same moment, this reduces to the local time.
--> last_capture_time_ = captureFrame.render_time_ms();  // save the latest capture timestamp (local time)
--> ViECapturer::OnIncomingCapturedFrame(I420VideoFrame& video_frame)
--> // Make sure we render this frame earlier since we know the render time set
// is slightly off since it's being set when the frame has been received from
// the camera, and not when the camera actually captured the frame.
Subtract the capture delay (190 ms on Android):
video_frame.set_render_time_ms(video_frame.render_time_ms() - FrameDelay());
III. Video Frame Flow from Capture to Encoding
ViECapturer::ViECaptureProcess()
--> ViECapturer::DeliverI420Frame(I420VideoFrame* video_frame)
--> ViEFrameProviderBase::DeliverFrame()
--> ViERenderer::DeliverFrame()  // local preview window
--> ViEEncoder::DeliverFrame()   // deliver to the encoder
--> Convert render time, in ms, to RTP timestamp.
const int kMsToRtpTimestamp = 90;
const uint32_t time_stamp = kMsToRtpTimestamp * static_cast<uint32_t>(video_frame->render_time_ms());
video_frame->set_timestamp(time_stamp);
--> VideoCodingModuleImpl::AddVideoFrame()
--> VideoSender::AddVideoFrame
--> VCMGenericEncoder::Encode
--> VideoEncoder::Encode()  // for VP8 this is the VideoEncoder subclass VP8Encoder;
                            // for H264 it is the subclass H264Encoder
--> VCMEncodedFrameCallback::Encoded()
--> VCMPacketizationCallback::SendData()
--> ViEEncoder::SendData()