When the NAB Show went virtual, so did EEG Video and Ai-Media!
We brought attendees together for our NAB Virtual Event Webinar Series, where we presented three essential webinars to cover closed captioning’s most recent developments.
Our newest free-to-attend webinar is now available to watch on demand. On Tuesday, October 12th, we presented What's New in Captioning. This webinar walked attendees through the latest innovations in closed captioning, from broadcast solutions to multilingual services. Bill McLaughlin, CTO at EEG, unveiled new product updates and previewed our next releases.
What's New in Captioning • October 12, 2021
The combined strengths of EEG Video and Ai-Media form a one-stop-shop resource for captioning, translation and transcription solutions. Look to us for the latest closed captioning news, tips, and advanced techniques.
Visit here to see What’s New in Captioning and all previous webinars!
Transcript
Regina Vilenskaya: On behalf of Ai-Media and EEG, I'm very happy to welcome you to the NAB Virtual Event Webinar Series. I'm excited to introduce Tony Abrahams, CEO of Ai-Media, and Phil McLaughlin, CEO of EEG. Tony?
Tony Abrahams: Thanks very much, Regina. My name is Tony Abrahams, I'm the co-founder and CEO of Ai-Media, and I'm devastated that we can't all be together at NAB in person this year. But we've tried to do the next best thing which is to make all of our great content available to all of you virtually. And I'm delighted to be joined by a person who's very well known to almost all of you I'm sure, the CEO of EEG, Phil McLaughlin. Phil, how are you?
Phil McLaughlin: Oh, very good, Tony. Glad to be with you.
Tony Abrahams: And one thing that's obviously changed since the last time we appeared at NAB is that Ai-Media and EEG have joined forces. Phil, after running that business for 25-odd years, what made you decide to join forces with Ai-Media? I guess two questions: why now, and why Ai-Media?
Phil McLaughlin: This has been very much on my mind the last couple of years: we're the company that's built this level of excellence and momentum in the US market, so where could we move from here? Two things happened. First, we needed more international exposure as a company. Second, with our movement into captioning services with our Lexi automatic captioning product, we had become a significant service provider in the US market. And from that standpoint, for the last five years we've had the pleasure of dealing with Ai-Media as a trusted partner, a partner with customers in the U.S., Canada, and all around the world. And we thought, and it's very much coming to fruition now, that we could join together with Ai-Media and combine our products, where we have almost no overlap but a tremendous amount of compatibility, and bring those products and services both to the U.S. market and around the world.
Tony Abrahams: And I think what we've been able to offer that's been really compelling, just in the last few months since we completed this acquisition in May, is a true one-stop shop right around the world. We offer both fully automated captioning through EEG's leading Lexi system and the traditional premium-quality captioning that Ai-Media has been known for, for many, many years, delivering accuracy of over 99.5%. But there's also this really interesting spot in the middle, isn't there, with Smart Lexi, which Ai-Media arrived at by moving down from premium and which you moved into by moving up from Lexi, and that's the happy place where we meet. So hopefully there's something for everyone in this great product suite. So Phil, thanks very much for joining us. And Regina, I'll hand back to you.
Phil McLaughlin: It was a pleasure. Thank you, Tony.
Regina Vilenskaya: Thanks, Tony and Phil. If you would like to hear about any of these topics or products in more detail, please join us for the NAB Virtual Event Webinar Series. We look forward to having you join us. So thank you. Hello, everyone, and thank you for tuning in today for What's New in Captioning. This is the first of three webinars being hosted by EEG and Ai-Media as part of the NAB Virtual Event Webinar Series.
My name is Regina Vilenskaya, and I'm the marketing lead at EEG. With me on this webinar is Bill McLaughlin, EEG's Chief Technical Officer. Today, Bill will walk through recent and upcoming releases across EEG and Ai-Media's range of captioning and subtitling solutions. We'll finish the webinar with a live Q&A session, so if you have any questions, please enter them at the bottom of your Zoom window in the Q&A tool. With that, I'm now going to pass it over to Bill. Welcome, Bill, and over to you.
Bill McLaughlin: Thank you. Alright. So, hello, welcome, and thank you so much for taking some time in your day today to be with us. I had intended to present this webinar with James Ward, the UK-based Chief Sales Officer of Ai-Media; we were going to do a bit of a joint presentation from both sides of the company. Unfortunately, James has come down sick today and isn't going to be able to be with us. So I'm providing the full experience for you, and I will try to guide you through everything. I've certainly had some time to learn it over the past few months, getting all the Ai-Media material and watching as these companies really merge and build the products together.
So we were really looking forward to NAB as a bit of a coming-out show, a chance to speak to our customers and our partners and talk about what's been going on in this new unified business. And in general, it's of course been a long time since we've had one of these industry events where we could have that kind of open conversation and talk about the progress that we're seeing.
In the spirit of NAB, we'll usually have a lot of caption newbies just looking for information for the first time, and we'll also meet a lot of old friends, I'm sure, from both sides of the business, people who are very familiar with the solutions from these groups and looking to hear what's new. So in that spirit, we're mostly going to focus today on what's new in the product range and in the solutions we're offering. There is going to be another webinar next week, a Closed Captioning 101, that's a little bit more focused on just getting started: how to understand the different terminology, and how to understand what types of caption delivery and services might be appropriate for a given type of event where you've never seen captioning before. So we're going to talk today mostly about live captioning. Live captioning is a bit of a specialty compared to VOD or captioning of video clips; it often requires a more specialized workforce and more specialized technology, and it's where both of these companies' specialties really unite.
So we'll be talking about that today from the perspective of the service that provides transcription or translation, which actually moves the spoken word into writing both for accessibility and for other organizational and search benefits, and also from the perspective of the delivery workflows that help that accessibility data get into your video or reach your end audience. And it's going to be a little bit different depending on, for example, whether you're approaching it from the perspective of television broadcast, streaming video, live events, or corporate training. All of these have their own nuances, and I think we've built a pretty compelling line of products that covers almost all of these specialties. Ai-Media is probably the only company active right now that really runs across the full scale of the different tiers of live captioning, from a completely automatic form of live captioning with Lexi to the highest skill level of human captioners with the Ai-Media premium service.
So we work with all three of those tiers, and in this webinar we're going to try to give you a good lens into some of the strengths, weaknesses, and decision factors with these different types of services. It's worth noting that with the EEG equipment there is also a very strong relationship with a lot of third-party suppliers of human captioning services, who all provide services through the EEG equipment and trust EEG's networking and equipment to get their captions into their end customers' use cases. That's going to continue; there are a lot of great partners there. And this webinar itself, for the first time in our EEG webinar series, is actually being captioned by one of Ai-Media's North American human captioners. I think she's doing a really outstanding job. So let's consider that before anybody thinks that automatic captioning is the only form of captioning of interest; that's certainly not true. When you look at these different tiers of captioning, you'll see there are a few different categories out there. The lowest tier is something that Ai-Media and EEG are not going to be directly involved in, because it would typically mean uncustomized, out-of-the-box captions associated with a platform.
That platform might be something like the built-in captions in a Google meeting. Those captions will often have fairly good quality, especially if you speak clearly, but there can be a lot of issues with special terminology or with speakers the engine has trouble understanding, and there's not a lot you can do about what those services do; each one does its own thing. It also doesn't help you when you need to broadcast to external audiences through a variety of platforms, because it's fixed to the platform, and it does what it does, often for free. So there are obviously use cases where that's going to be enough. As you move into third-party solutions, you get a somewhat better experience. The EEG Lexi service is available as a self-serve, very easy-to-use automatic captioning service, and it gives you the ability to deliver to multiple platforms and to customize the caption results in a way that out-of-the-box products might not give you any control over.
Moving up from there, our Smart Lexi product gives you the ability to work with Ai-Media's expert coordinators and curators, who will build a model specifically for your style of content based on preparation material delivered by the customer: past videos, documents, websites, lists of vocabulary and people. We know how to ask the right questions to get the best information we can. Our team will work on these models and provide you with a full-service experience more similar to traditional captioning, while still leveraging the very fast turn-on, reliability, and somewhat cheaper cost of an automated solution. Finally, a lot of customers, especially large broadcasters and a lot of types of corporate presentation, are still going to benefit from the premium steno and respeaker services. That generally provides the customer with the highest quality and, in some ways, the lowest effort. How do these services work?
You do typically need to book them in advance. But on the other hand, you're working with experts who are going to show up, do the job, and really do the job right, and who can barrel through, frankly, a lot of the types of adversity that automatic captioning will sometimes have problems with: clips with a lot of music, speakers with stronger accents, people who speak over each other, people who mumble. All of these problems are a lot easier to handle when you have a human who sees the full context of the presentation and understands it. So there's still a really valuable place in the ecosystem for customers to consider. Now, the key when thinking about Lexi versus Smart Lexi is that Lexi is a software or technology service. Lexi offers great automatic caption technology; with the right conditions you could be getting 96% accuracy or more. But it's a technology, so essentially the customer is responsible for turning it on and turning it off.
The customer also handles uploading the preparation material and doing QC to make sure it's running when it needs to be running and that a quality result is being achieved. A lot of that falls back on the user, which in some scenarios can be very convenient, because it means you're able to take full control over the job. But it can also be a lot of management overhead, and that's difficult for some companies, especially when you're just getting started with captioning. So Smart Lexi offers a set of management features that will help you actually get started with captioning: making sure the scheduling is taken care of, making sure you have coordinators, and making sure you're getting the help you need so that the captioning is online, that it's on your video, that it's gotten all the way through to the end point, with customer service and even technical support when it's needed. At the same time, it makes sure you'll get the best accuracy possible from today's state-of-the-art AI captioning, with modeling that's specific to your application.
The second set of questions, beyond where your basic transcript is coming from, is where it's going to. That's traditionally the domain of the EEG technology side of the business. We've done this work for many, many years, originally in broadcasting, making hardware closed caption encoders, and that's expanded in the last few years to software-based encoders that are virtualized and cloud-based. We'll return to broadcast solutions in a couple of sections. But right now, thinking more in terms of enterprise live events and corporate video, the first solution we're going to look at is Falcon. Falcon is perfect when you have streaming video and you're looking to get consistent, high-quality captions that will appear on a range of platforms. Some of these platforms might have solutions for captions that are built in or ad hoc; some may not, and you have to ingest the captions externally.
But everything from really big players like Facebook, YouTube, and Twitch down to smaller platforms that you might use as a media platform: the captions are going to go in from Falcon to any of these, because Falcon works in a standardized way that the platforms and the players can ingest. So that's the great news about closed captions with Falcon. Now, because it works in a standardized way based on broadcast captions, you can have some challenges with different languages, and that's another thing we'll talk about a little more in this webinar. Essentially, you may have to use some of the other features of Falcon, like an open caption overlay, to achieve what you need to do in a cross-platform way that's also multilingual and isn't limited to English and other European languages. Sometimes your event may not really be a video event at all, but an in-person event with a speaker, or another kind of conference that doesn't follow a one-to-many broadcast format.
And that's where our Ai-Live solution comes in. Ai-Live provides an audio connector that you can use for audio coming from your computer, whether that's from a microphone, even a higher-quality microphone, which would be recommended for something like an organized lecture or a classroom setting, or any other audio device connected to your computer, or else audio reflected from something like a web conference by signing in to Zoom or Teams or anything like that. Ai-Live will connect that audio stream to a remote source of captions, and that remote transcriber will deliver text directly to the students or users of the solution on their personal devices, or on a device or computer that's been issued by the institution. You can also put this on a big screen in the lecture hall if needed, and you'll see a scroll of captions that takes up the screen and is a little more visible, like a CART-type solution, with a history you can look back through.
So it provides a slightly different form of accessibility than the built-in video captions. The video captions are really good for a mixed audience where you want to show one video and broadcast it to a lot of different users: some of them will want to see the captions, some won't, and some might want to see the captions in different languages from other audience members. That's where closed captioning really shines. But CART captioning through Ai-Live, or open captions, can be good for providing a targeted link that offers a consistent, controlled, high-quality viewing experience for essentially one language per link. And if you needed to publish multiple streams or multiple links for different languages, you could do that; you would simply need to explain to the audience where each of these was going and why. So let's dig a little more into this multilingual issue, because that's an area where we've seen, over the last year or two, just tremendous growth in the number of live events and enterprise communications looking for a global multilingual captioning solution.
It's also an area where the interoperability of systems has unfortunately been weak, and in too many cases it can force customers to settle for less than their original goals for the presentation. If you look at the origins of that, it's simply because the live streaming standards have inherited so much from the live broadcasting standards for closed captions and Teletext, which were developed in North America and, to some extent, in the UK and Europe. These were never really intended to be global multilingual standards, and unfortunately that legacy has followed us into a lot of streaming platforms today that still use standards originally developed in the early 1990s or even before. So the goal is going to be to get the content from any language to potentially any language. To see the value in that, it's important to understand that a lot of our customers will start with a statement of a vision along the lines of: my company operates in 16 countries around the world, and the business language may be English, but that doesn't mean all of these speakers are actually most comfortable in English, and we could get a real benefit from having a combination of the spoken text in English and some translated real-time data that helps speakers of other languages natively understand what's happening.
That's going to increase comprehension, and it's going to provide an easier way to search and understand the transcripts for this data in the different systems it goes into downstream of the live captioned event. So the potential benefits really are enormous, and depending on the mix of solutions used, the cost for incremental languages can be fairly low. There are a lot of different event styles we've worked in, and here is honestly just a sample of the languages Ai-Media has done events in over the past year, with a range of resources available at different price tiers. For some language pairs, getting a real-time interpreter and transcriber might be quite expensive; when it's a premium event, perhaps connecting diplomats from two countries, that can be an expense that's easily justifiable. Other times, there's a desire to cover the entire viewer audience with something that we understand may not be completely perfect, but that will offer a high-quality machine translation of the content and provide some basic context. Even if every idiom isn't done perfectly, it's still going to offer a lot of value. We can even offer improved captioning for product names, company slogans, or anything else where we really want a specific idiomatic translation: we've built systems that allow us to actually do those substitutions and help get that right for clients.
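To make the idea of protected terms a bit more concrete, here is a minimal, purely illustrative sketch, not Ai-Media's actual implementation, of how a glossary could be applied around a generic machine-translation step so that product names and slogans come out the way the client prefers. The machine_translate function and the glossary entries are stand-ins for whatever engine and preferences are actually used.

```python
# Illustrative sketch only: protect glossary terms around a generic
# machine-translation step so brand names and slogans are rendered
# the way the client prefers. machine_translate() is a stand-in.

GLOSSARY = {
    # source term -> preferred rendering in the target language (examples)
    "Smart Lexi": "Smart Lexi",         # keep the product name untranslated
    "one-stop shop": "guichet unique",  # example preferred French idiom
}

def machine_translate(text: str, target_lang: str) -> str:
    """Placeholder for the real MT engine call."""
    return text  # identity stub so the sketch runs end to end

def translate_caption(text: str, target_lang: str) -> str:
    # Shield glossary terms with tokens the MT engine won't touch,
    # then restore the preferred rendering afterwards.
    restores = {}
    for i, (term, preferred) in enumerate(GLOSSARY.items()):
        token = f"__TERM{i}__"
        if term in text:
            text = text.replace(term, token)
            restores[token] = preferred
    translated = machine_translate(text, target_lang)
    for token, preferred in restores.items():
        translated = translated.replace(token, preferred)
    return translated

print(translate_caption("Visit our one-stop shop for Smart Lexi.", "fr"))
```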
So the goal is really to go across a range of solutions and make sure you can offer something in each language you're looking for. This is an example of work we've done in EEG's Falcon with some languages that are often difficult to support: Japanese captions and Russian captions, which we're able to do live and in real time so long as you're working in formats that actually support the language. To break that point down a little more, about interoperability and some of the challenges there, we've made this little table, and what you can see is that we offer for delivery a range of solutions in Falcon as well as the Ai-Live text platform that we saw before.
The basic message is that the solutions range from the embedded RTMP captions in Falcon, which offer the best interoperability when you're operating in a supported language and will definitely get your captions to the broadest array of platforms if you're primarily looking at English or another Latin-character language. When you start moving into world language support, you'll want to use one or more of the other alternatives, since those languages won't be supported cleanly in embedded captions. That can range from putting a text track in an HLS stream directly from Falcon, which is a great solution if you can just put a player on the event web page and play our HLS stream. That works well if you're not relying on a larger platform to provide all the features for the video, like social, search, and a future VOD archive. And it can be used in combination: you can have the main video for the supported languages on your main platform, and you can have additional links with open captions or a range of HLS closed captions on a secondary player where that's needed.
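As a rough illustration of what the HLS text-track route looks like from the player side, here's a small sketch, with a placeholder playlist URL, that fetches an HLS master playlist and lists any subtitle renditions (the #EXT-X-MEDIA:TYPE=SUBTITLES entries) it advertises. This is a generic HLS example, not a description of Falcon's exact output.

```python
# Sketch: list the subtitle renditions advertised in an HLS master playlist.
# The URL is a placeholder; point it at your own event's master.m3u8.
import re
import urllib.request

MASTER_PLAYLIST_URL = "https://example.com/event/master.m3u8"  # placeholder

def list_subtitle_renditions(url: str):
    with urllib.request.urlopen(url) as resp:
        playlist = resp.read().decode("utf-8")
    renditions = []
    for line in playlist.splitlines():
        if line.startswith("#EXT-X-MEDIA:") and "TYPE=SUBTITLES" in line:
            # Each subtitle rendition carries a language tag and display name.
            lang = re.search(r'LANGUAGE="([^"]+)"', line)
            name = re.search(r'NAME="([^"]+)"', line)
            renditions.append((lang.group(1) if lang else "?",
                               name.group(1) if name else "?"))
    return renditions

if __name__ == "__main__":
    for lang, name in list_subtitle_renditions(MASTER_PLAYLIST_URL):
        print(f"{lang}: {name}")
```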
Ai-Live offers a similar solution, essentially: you can have multiple languages of links to that CART text-based transcript in Ai-Live, and that can accompany the main video program when all the captions can't necessarily ride along with it. So, we're going to move now to talking about broadcast captioning a little bit more, which is the other side of the delivery coin when it comes to the EEG solutions. In some cases this can overlap too, because today broadcasting is not just over the air and not just the traditional channels; it can include a lot of streaming broadcasts and more professionalized live events that overlap heavily with the types of solutions we'd use for broadcast television. So there's definitely a range, from captioning a single small classroom up to major sporting events like the Super Bowl or the Olympics, and you're going to see all of these things in between.
And of course, the lines between different types of media and different types of production are definitely blurring for many of our customers, and we're looking to support that for customers who are doing uncompressed video and looking at things like moving to 4K, Ultra HD, and HDR production. We're seeing a bit of a split right now between customers who primarily see this as an IP transition, where the next generation of broadcast video products needs to be virtualized, in the cloud, and built around a software-focused workflow, and other places that are currently more localized in the AV space, where there's still a need for a contained, easily portable, more SDI-style deployment, but at that higher resolution. Because if you look at where you'd really benefit from something like 4K video, it's live events where you have a jumbo screen; that's where you're really going to see the benefit of the added resolution, probably more so than delivering streaming video to a laptop, where you already have a lot of compression involved, or even broadcast and cable, for that matter.
So we'll be looking at solutions both in our SDI range, where we have a new 4K-compatible encoder, and with Alta, our IP video product, which supports standards in the MPEG transport stream family, SMPTE 2110, and CDI, and is useful on-prem on hardware, on-prem as a virtualized appliance, or in the public cloud using something like AWS or Azure. Our biggest announcement for Alta this year is that we're supporting AWS CDI, and we're expecting production support for that by the end of the year. We've done a number of pilots and integrations with AWS and other vendors interested in this approach. CDI is really interesting because it provides uncompressed video heavily derived from what's been going on with SMPTE 2110, but it takes that approach and puts it in the public cloud, which SMPTE 2110 itself can't do because of its reliance on multicast IP and PTP. The great thing about 2110 is that it's really an open standard that a lot of different vendors have supported successfully with commercial off-the-shelf switches and servers.
CDI has a bit more of a lock-in to AWS, in that it uses some proprietary technology in the AWS fabric adapters. That being said, so many broadcasters are looking to do more in AWS and looking for an uncompressed workflow, and CDI makes that really easy and provides a pretty smooth, lossless experience. I think you're going to see a growing number of vendors supporting it when the important questions are: how can we move to the public cloud without losing anything in terms of resolution, frame rate, or the ability to do very low-latency operations? You even see a benefit from that in closed captioning, because when you're doing Alta or Falcon with a compressed stream, there is some additional latency associated with the compression algorithms and how captioning fits into them. With CDI, you're closer to where you are in the SDI baseband domain, where captions can go in on the very next frame once the transcription is available.
In the SDI domain, the new product this year is the AV650, our first product to support 12 gigabit-per-second SDI. That will carry UHD or HDR signals that, with previous generations of equipment, would have required what's called a quad link, where four lower-bandwidth SDI cables each carry part of the same video picture. That's been a problem for things like caption encoding, where you need to access the vertical ancillary space in one quad of the video. So now we have a 12 gigabit-per-second SDI product that can do all of that in a single box, and we strongly recommend it especially, as I said, in the AV space, where you'll actually receive the benefit of the native resolution on the built-in caption decoder. It's capable of doing closed caption encoding for downstream decoders or for television, and it's also capable of doing open captions. It has all the features of the EEG AV610, our product for live events, which gives you the ability to move the captions around, control their size, and do things like put the captions over a still-image overlay that is generated into the video in the SDI box.
That really helps with live presentations where you might want more control over caption appearance than you could achieve in the broadcast domain, where you essentially have to encode the captions to a fairly narrow standard in your broadcast closed caption encoder, and the decoder is a consumer set-top box way down the chain made by another brand, and so on. So if you want those really sharp font selections and things like that, which can make the difference in an AV presentation, this is definitely the product for you. We'll take a quick look now at some of the things coming next from EEG and Ai-Media. We're going to keep refining the value we deliver to customers by looking at the issues we encounter day to day, understanding that as a 360-degree business in captioning, translation, and accessibility, we have a lot of power to help in ways that are harder when less of the ecosystem is being addressed by the product.
So, one of the key things we're working on, and will be rolling out to some customers over the rest of this year, is the ability to take a benefit you might associate with automatic captioning, being able to do API-driven or simple online booking, and make it available across all quality tiers of captioning, including Smart Lexi work and human captioning work. There will be a booking site where clients of Ai-Media can easily book new sessions; in some regions this could be with as little as 24 hours' notice. You put all the information about your event and your needs into a pretty simple set of web forms, you get a confirmation, the caption event goes ahead, and you're billed through the billing information and relationship you already have on file, or, probably in the future, through direct payment with your credit card the way we do on EEG Cloud. So imagine all those systems being integrated, where what you're doing with Falcon is completely compatible with all forms of captioning, including human captioning, and you can pay for it on one bill and book it all through the web, without needing ad hoc interactions with your account representative, who obviously is a really, really nice guy. But if you want to just book this online, or have it booked through a scheduling API that you already use in your production systems, that's going to be supported. And I think that's going to be a real lift for a lot of customers in terms of making captioning easier and more of a "why not?" for a broad range of events. It becomes a simpler calculation, with less logistical struggle to do captioning, and the only things you'll have to consider are the audience, the delivery style, and the budget for the event, and hopefully you can include captioning and translation in as many places as possible.
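Just to illustrate what "booked through a scheduling API" could look like in practice, here is a purely hypothetical sketch; the endpoint, field names, and authentication below are invented placeholders, not an actual Ai-Media or EEG API.

```python
# Hypothetical sketch only: the endpoint, fields, and auth scheme are
# illustrative placeholders, not a real Ai-Media or EEG booking API.
import json
import urllib.request

BOOKING_ENDPOINT = "https://api.example.com/v1/caption-bookings"  # placeholder
API_KEY = "YOUR_API_KEY"                                          # placeholder

booking = {
    "service_tier": "smart_lexi",            # e.g. lexi / smart_lexi / premium
    "start_time": "2021-11-01T18:00:00Z",
    "duration_minutes": 60,
    "languages": ["en"],
    "delivery": {"type": "falcon", "target": "rtmp"},
    "prep_materials": ["https://example.com/agenda.pdf"],
}

request = urllib.request.Request(
    BOOKING_ENDPOINT,
    data=json.dumps(booking).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
with urllib.request.urlopen(request) as response:
    print("Booking confirmed:", json.loads(response.read()))
```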
In translation, we're continuing with the multilingual program to make sure we're offering a complete, best-in-class solution for each language. For some languages, that means recognizing that it really pays to use local providers of software and services, and we're trying to onboard as many vendors as we can, as quickly as we can, in a way that makes sure we have enough capacity to fill all our customer needs across languages.
It also means being able to evaluate different vendors and provide the world's best-in-class quality for each of these languages and language pairs. As a largely Anglophone company, so much attention has gone over the years into making sure we have really high-quality operators in English, and into testing everything we do against NER and other metrics to get great English performance; we need to be able to do that for all the languages we're going to offer. So it's important not just to be offering a solution in Turkish or other languages, but to be able to offer a great solution. There's a lot of focus going into that, and also into helping customers with the delivery aspect. Because when we talked about how you'd get your viewers to, for example, multiple open caption streams in different languages, that can be hard to do. You might be thinking: well, maybe I'm going to have to put 18 different links in the comments section of my main video and hope that doesn't get buried, hope we can pin it to the top and someone sees it.
We're looking to help with that by giving our customers a global accessibility playback portal. Essentially, with a single link, you can have access to all of the different streams of your video that might be needed to cover all of your requirements, all of your languages, and any specialized style requirements, with an easy way to direct viewers to what they need even when you have trouble delivering it on your main platform. That's really the ticket there: to build something that helps you get everyone to the videos they need and make that as smooth a part of the experience as possible. Finally, in the Alta space, we're seeing massive adoption of Alta into the cloud, including, honestly, on the types of networks where a couple of years ago it wouldn't really have made sense to attempt to share high-quality video. So we're in the midst of a process to add several, well, three of the commonly used forward error correction and encryption protocols used for transporting high-quality video across different places in the cloud, including potentially remote production that could be occurring from, for example, a sports venue or studio in another country or on another continent, and to make sure that your stream arrives fully intact.
It's something that not all of the standards were originally built to support, but there are a lot of great protocols today that will make us interoperable with some of the best video encoders and other systems out there. So we're going to be beefing up Alta in that area and making sure it's a bulletproof solution up there in the cloud. I hope this webinar has given you a good flavor of how EEG and Ai-Media see our business today and our relationship with you, our customers, and of how the global scope and the increased 360-degree scale of providing services, video technology and encoding products, and both automatic and human tiers of transcription and translation can really drive some value for you. We look forward to hearing any further ideas and feedback on that; at NAB, or something like it, we get to communicate in both directions, and that's really a lot of fun. So I hope to see everyone soon.
I know that Regina, I think, is going to guide us through a little bit of Q&A if anyone wants to stick around for the next few minutes and I'll try to help with any questions you have.
Regina Vilenskaya: Thank you, Bill. So, we are now at the Q&A portion of this event. So, if you have any questions and have not done so already, please enter them in the Q&A tool at the bottom of your Zoom window. So, the first question I have is, are any of these products compatible with audio description?
Bill McLaughlin: Yeah, so we are currently offering audio description services in the recorded space, which was not a space we spent a lot of time discussing today. For anybody not familiar with the terminology: say you have VOD clips, or just a video archive or library, and the question is, here is the video, what can I do to add captions, to get transcripts in multiple languages, or to add audio description? Audio description would typically be an additional audio track associated with the video that describes the non-verbal information in the video: any sections where there's anything from music to action sequences to anything else that's important to understanding the context of the video but that you wouldn't get from listening alone. So that involves experts trained in audio description writing a script, performing the voiceover, and getting that into the clip as an extra audio track.
And yes, it is a service we offer in the VOD or clips domain. It's not a service we're currently offering in the live domain, and obviously it is something that could make sense in some genres of live video, but most broadcasters are not really working on a live requirement right now. It is being done in a lot of places in the world for recorded material.
Regina Vilenskaya: What engine or product is driving the captions via Zoom?
Bill McLaughlin: In today's webinar, all the captions are being done by one of our human service providers from North America. She's doing a fabulous job, as I was able to see a little bit during the opening; since then I've needed to focus more on reading my slides. The captions may be fed through EEG's Falcon system, and many of the caption writers also have software that's able to feed Zoom directly from their captioning, so it will depend a bit on the individual workflow. If you use automatic captions or Smart Lexi, the text gets fed through our Falcon product via its HTTP connector. Zoom provides a link that third-party software captioners can send the captions to directly, and that same link can be used through Falcon's HTTP output for any external sources of captioning or any third-party transcribers out there looking for a solution.
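For anyone curious about the mechanics of that Zoom link, here is a minimal sketch of how a third-party tool might post a caption line to it. The token URL below is a placeholder, and the seq/lang parameters reflect Zoom's published third-party closed-captioning integration, so check Zoom's own documentation for the specifics.

```python
# Sketch: post one line of caption text to a Zoom meeting's third-party
# closed-caption URL (the "API token" link the meeting host copies from Zoom).
# The URL is a placeholder; seq should increase by one with each post.
import urllib.request

ZOOM_CAPTION_URL = "https://wmcc.zoom.us/closedcaption?id=PLACEHOLDER"  # placeholder

def post_caption(text: str, seq: int, lang: str = "en-US") -> None:
    url = f"{ZOOM_CAPTION_URL}&seq={seq}&lang={lang}"
    request = urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    urllib.request.urlopen(request)

post_caption("Welcome to the webinar.", seq=1)
post_caption("Today we'll cover what's new in captioning.", seq=2)
```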
Regina Vilenskaya: What are your predictions about the evolution of AI captioning?
Bill McLaughlin: Yeah, that's a big question. I think what we see is a lot of really healthy competition out there in the space for building new engines and new technologies that improve recognition for accents. There are some key problems out there; frankly, we tend to operate primarily in the commercialization space compared to doing really basic R&D like university research. But some of the really interesting problems involve things like how quickly you can detect a speaker change in AI captions in real time; what's called code-switching, where different languages are spoken in a continuous video stream and the system needs to understand when to switch languages for recognition; and understanding things like sound effects and music better. So I think there are two questions: one involves your basic accuracy of speech, and the other involves all these other issues that are very important to captioning but maybe less important to some of the other popular applications of AI and speech.
They might be less important when you're doing voice command work, or call center or other kinds of business process transcription. So there's definitely a lot of work there. I do think the technology is going to continue to improve, and I think we're going to see a growth in acceptance of automatic captioning in a lot of spaces; we're going to see more and more automatic captioning all the time in the products around us, in our remote meetings, and all of those kinds of things. But I do think there is still a ways to go for AI captioning to really pass what you might call that Turing Test, where everyone agrees it's the same as or even better than the most skilled human operator. That John Henry moment, you could call it, is a while away.
Regina Vilenskaya: Do we support AI or machine learning-based captions for Arabic? And if so, what level of accuracy can we expect?
Bill McLaughlin: We do have support for AI captioning and machine translation into Arabic. I couldn't quote you a precise accuracy number right now; I'd encourage you to reach out, and maybe we can share some samples or data. I also know that it's something we've done some pretty active research into, looking at different suppliers. I'm not an Arabic speaker myself, but one of the issues is that there are a number of different dialects, and it's not uncommon in a lot of types of media to have different dialects and speakers moving in and out of the same content. So we've been very interested in work that addresses questions like: is it possible to switch smoothly between speakers who could be from different countries or areas and continue getting a quality transcription, without having to assume we'll use a certain model for a certain country and that's it? So yeah, definitely an interesting subject.
Regina Vilenskaya: Is Ai-Live being used in the classroom environment? And if so, do new links have to be generated for each class session or is there a link that can be bookmarked and used for each class session?
Bill McLaughlin: Yeah, Ai-Live is definitely pretty widely used in classroom environments; that's possibly the application it was originally developed for. It comes from the Ai-Media side of the company and has been around for quite a few years. The default on Ai-Live is that each session is a private link with private settings. There is a feature available where you can have a persistent link generated; the persistent link comes from your client identity, and your client identity can be tied to an ongoing set of sessions occurring at different times. So, for example, if you sometimes needed multiple streams at once, you might have to set up different persistent link identities for those different streams. But for one recurring session or set of lectures, yes, you should be able to bookmark a persistent link.
Regina Vilenskaya: Is there any chance to make Microsoft Teams have external captions, whether with EEG or Ai-Media products?
Bill McLaughlin: Yes. Actually, there was just a Microsoft announcement about this; I believe it went public sometime last week, and Ai-Media had been one of the main testers for it. Teams is putting out a Zoom-like feature that will allow you to send third-party captions to Teams, and we are prepared to do that. I believe that as a Teams user you currently need to ask to join this as a pilot user; I don't think it's in general availability across everyone using Teams yet. But if you look for the Microsoft link, there has been a new announcement, and yes, we're prepared to deliver captions on Teams now.
Regina Vilenskaya: And it seems we have time for two more questions. So, do we have human operators on other languages than English?
Bill McLaughlin: Yes, both for captioning, which is essentially the audio and text being in the same language, so transcription or captioning, and also for interpretation. If James were here, he might be able to give you a more complete list of exactly which languages we provide human versus machine translation services in. But I know there are some languages provided in-house and some provided through vendors we have a relationship with, so we can subcontract it and still help the client out. I would encourage you to get in touch with any specific language-pair questions, and we can tell you what range of services we can offer.
Regina Vilenskaya: And the last question. Are there any new requirements for captioning emerging, or due soon, which will require captioning for live or on-demand videos?
Bill McLaughlin: To answer the question with a global scope, I'd probably have to say yes, there are new regulations emerging, but I'm trying to think of what the most pertinent news would be for most of our viewers. From a US-centric viewpoint, I would say there hasn't been new FCC-type movement that I'm aware of super recently. There has definitely been a continued drumbeat of extension of the Americans with Disabilities Act's protections to streaming video. Right now, streaming video is only regulated by the US FCC, the broadcasting regulator, if the same video is also being simulcast or was previously broadcast; essentially, the streaming video rules piggyback off of the broadcasting rules. And of course, that's become less relevant as more and more content is streaming-original in professional media, or is enterprise-style content that just isn't media and entertainment.
That kind of content isn't going to be held to the same standard. But there is definitely a growing body of case rulings saying that, in a pretty wide array of applications, you have an obligation to provide accessibility on streaming video in the form of live captions. So that's something where the specific advice for your situation, your audience size, and obviously where you do business can make a big difference, but it's definitely something to be aware of.
Regina Vilenskaya: Thank you. So, we have reached the end of today's webinar. Thank you to everybody who could join us today. And if you have any questions that you would like answered, please feel free to reach out to sales@eegent.com or sales@ai-media.tv. And thank you all again for joining us and take care.
Bill McLaughlin: Thanks.