Shaking off the dust from what could be described as the longest year known to man, remote work is a hot topic in the world of employment. Having established both its benefits and its challenges, remote work has people talking about its permanence. What is more, employees have become accustomed to remote working; in fact, many of them actually prefer it to the office. According to a FlexJobs survey, 65% of employee respondents reported wanting to be fully remote post-pandemic, and 31% want a hybrid remote work environment. That’s 96% who desire some form of remote work.
These numbers inevitably mean that the ways in which we worked during the pandemic, primarily through screens and video calls, will have some longevity.
In the past year, there has been a daunting amount of “incidental” or unintentional content creation across the many digital platforms we now operate on. Within these massive amounts of data, however, there is a wealth of insight to be had.
With the right tools, your business can work smarter, not harder, and extract this valuable knowledge from the content generated by workers’ day-to-day interactions. This acumen can be the competitive edge your business needs as we move forward into a technological workday, with AI in video enhancing many facets of the modern work world.
How AI Has Changed the Video Streaming Experience
With the growing number of online meetings and the volume of content created in 2021, the key to video streaming in the modern work world is navigation. Video has the potential to bring content to life, but more importantly, it can give you the ability to access what’s in a video in an intuitive and efficient way.
Let’s look at it this way: would you buy a textbook if it had no table of contents, index, or chapters? Of course you wouldn’t. It would be crazy to have to find your way through pages of unstructured text, but that’s exactly what we do with video.
By implementing AI into video, you gain the ability to customize and easily access all of the context that exists in a video’s content.
Through machine learning (ML) and natural language processing (NLP), AI can do the hard work of deriving data for you, reducing your search time and the fatigue that comes along with it. From the audio and visual data, AI takes all of the available understanding from the video and tags content by keywords, concepts, and relevant topics.
The ML and NLP models then construct a transcript, and from there the AI creates an intuitive index: transcriptions, chapters, chapter titles, and finally a table of contents. This makes searching for content easier and more efficient for each user.
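As a rough illustration of that pipeline, here is a minimal Python sketch that turns timestamped transcript segments into titled chapters and a table of contents. The keyword-frequency “titling” below is a naive stand-in for the ML and NLP models described above, and all names and inputs are hypothetical:

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "we", "our", "is", "are"}

def title_for(segment: str) -> str:
    """Naive chapter title: the two most frequent non-stopwords in the segment."""
    words = [w.strip(".,").lower() for w in segment.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    top = [w for w, _ in counts.most_common(2)]
    return " ".join(top).title()

def build_toc(chapters: list[tuple[str, str]]) -> list[str]:
    """chapters: list of (timestamp, transcript_segment) pairs -> table of contents."""
    return [f"{ts}  {title_for(text)}" for ts, text in chapters]

toc = build_toc([
    ("00:00", "welcome to the quarterly budget review budget numbers are up"),
    ("12:30", "next the marketing team presents the new campaign campaign results"),
])
print(toc)
```

A production system would replace `title_for` with a trained summarization or keyphrase model, but the overall shape of the index it produces is the same.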
Video now differs significantly from video in the past.
Until now, utilizing the power of video has mostly been a highly meticulous, manual affair. Rather than tagging video media by hand in editor tools, crafting tags one by one or slicing a video by tagging minute intervals, AI can do the work for you.
A single label or title, or a tag at “minute six,” is pretty much meaningless when it comes to searching, because the keyword is limited to the interpretation of the publisher.
When you are looking for anything, whether in the grocery store or on Google’s search engine, you most often have something specific in mind. AI allows for a new kind of video tagging, with the capability to draw relevance to a plethora of topics and keywords. This enhances both the approachability and the scope of video organization and use, and it saves companies the staff, time, and resources they would otherwise spend applying these methods to their existing bank of video content.
OCR and How It Is Changing Video Conferencing
An emerging video technology, Optical Character Recognition (OCR), can now read the still frames in your video and determine whether any relevant text can be drawn out. This applies to things such as PowerPoint slides in the background or words written on a whiteboard behind the speaker. By combining the audio recording with the textual elements derived from OCR, AI can capture more content than ever before and create an all-encompassing transcript of the video.
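To make that concrete, here is a hedged sketch of frame-level OCR using the open-source OpenCV and pytesseract libraries; these are my illustrative tool choices, not necessarily what any particular vendor uses. The third-party imports are kept inside the function so the frame-sampling helper stays dependency-free:

```python
def frame_indices(fps: float, total_frames: int, every_seconds: int = 5) -> list[int]:
    """Which frame numbers to OCR: one frame every `every_seconds`."""
    step = max(1, int(fps * every_seconds))
    return list(range(0, total_frames, step))

def ocr_video(path: str, every_seconds: int = 5) -> list[tuple[float, str]]:
    """Return (timestamp_seconds, text) pairs for frames where OCR finds text."""
    # Requires: pip install opencv-python pytesseract (plus the Tesseract binary).
    import cv2
    import pytesseract

    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    results = []
    for idx in frame_indices(fps, total, every_seconds):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the sampled frame
        ok, frame = cap.read()
        if not ok:
            break
        text = pytesseract.image_to_string(frame).strip()
        if text:  # keep only frames where OCR actually found something
            results.append((idx / fps, text))
    cap.release()
    return results
```

The timestamped text snippets can then be merged with the audio transcript to build the all-encompassing transcript described above.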
This AI-driven transcription gives the capacity for “media contextualization,” which simply means the ability to look inside a video and draw out all of the pertinent information needed at that moment.
This process is made possible by NLP and ML, which combine their capabilities to create a knowledge base in which all relevant information is centralized. Deep learning can then analyze and organize all of the text into a systemized database, understanding when the context has changed.
The AI-driven technology then knows when to create a new “chapter” accordingly, outputting the phrase or blurb best suited as a title for that segment of the content. From there, an entire table of contents is generated for each video recording, with all of the information accurately and efficiently organized.
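One simple way to detect that “the context has changed,” loosely in the spirit of classic text-segmentation algorithms like TextTiling, is to compare the vocabulary of adjacent transcript pieces and mark a chapter break where their similarity collapses. This pure-Python sketch is a toy version of that idea; real systems would use learned embeddings rather than raw word counts:

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def chapter_breaks(sentences: list[str], threshold: float = 0.1) -> list[int]:
    """Indices where similarity to the previous sentence drops below threshold."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [i for i in range(1, len(bags)) if cosine(bags[i - 1], bags[i]) < threshold]

breaks = chapter_breaks([
    "the budget grew this quarter",
    "the budget will grow next quarter too",
    "now let us discuss hiring plans",
])
print(breaks)  # a break before the hiring discussion
```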
By allowing the AI to go further and make sense of all the data points demonstrated in a video, the technology can help to propagate relevant information across departments within a company.
This is important, especially with the uptick in recorded conferences in the modern work world. There are miles and miles of potential insights within a business’s recorded video calls, but there is a need to make sense of them in a framework that is relevant to the individual.
AI-driven technology, with OCR implemented, helps to make contextual connections throughout all of the information and create a user-friendly structure. This makes for a much more intuitive video-user experience and allows people to find and share exactly what they are looking for.
Connecting Context Through Ontology and DBpedia
With a well-organized, informed, and centralized base of knowledge, AI innovates further through what is known as an ontology: a set of concepts and categories in a subject area or domain, together with their properties and the relations between them.
One of my clients, a company called Ziotag, uses proprietary AI ontology technology to create tags within video media. This is initiated by first collecting all of the different terms that people might use when talking about a certain topic.
With this insight, the AI can work its magic and create ontology tags covering more than 50,000 concepts, finding the ways in which they all relate to each other.
This creates a multi-faceted and dynamic foundation of data points that could almost represent a human brain—using concepts, keywords, and context-understanding to deduce what a user might be looking for in the knowledge base.
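A miniature, entirely hypothetical version of such a concept network can be modeled as a graph, with relatedness discovered by walking a few relation hops outward from a term:

```python
def neighbors(graph: dict[str, set[str]], term: str, depth: int = 2) -> set[str]:
    """Concepts reachable from `term` within `depth` relation hops."""
    seen, frontier = {term}, {term}
    for _ in range(depth):
        frontier = {n for t in frontier for n in graph.get(t, set())} - seen
        seen |= frontier
    return seen - {term}

# Toy ontology fragment: each concept points to related concepts.
graph = {
    "salt": {"sodium chloride", "seasoning"},
    "sodium chloride": {"chemical compound"},
    "seasoning": {"cooking"},
}
found = neighbors(graph, "salt")
print(found)
```

At the scale of 50,000+ concepts, this kind of traversal is what lets a search for one term surface video segments tagged with related ones.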
When this local understanding is applied to the bigger picture of the internet, the possibilities are endless. A community project known as DBpedia extracts structured information from 111 different language editions of Wikipedia, eliciting knowledge using Semantic Web and Linked Data technologies.
The largest DBpedia knowledge base exists in English and consists of over 400 million facts that describe 3.7 million things — just to give you a bit of scale. These mappings were created via a worldwide crowd-sourcing effort in the hopes of enabling knowledge from all of the different Wikipedia editions to combine and create context from data.
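DBpedia exposes this knowledge through a public SPARQL endpoint at dbpedia.org/sparql. As a sketch, the following standard-library Python builds a query for the English abstract of a resource and fetches it; the live call naturally requires network access, and the resource name is just an example:

```python
import json
import urllib.parse
import urllib.request

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

def abstract_query(resource: str) -> str:
    """SPARQL query for the English abstract of a DBpedia resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/> "
        "SELECT ?abstract WHERE { "
        f"<http://dbpedia.org/resource/{resource}> dbo:abstract ?abstract . "
        'FILTER (lang(?abstract) = "en") }'
    )

def fetch_abstract(resource: str) -> str:
    """Run the query against the public endpoint (requires internet)."""
    url = DBPEDIA_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": abstract_query(resource), "format": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        bindings = json.load(resp)["results"]["bindings"]
    return bindings[0]["abstract"]["value"] if bindings else ""
```

For example, `fetch_abstract("Salt")` would return the opening description of the Wikipedia article on salt as structured data rather than a web page.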
Ziotag’s ontology approach mirrors this data connection strategy, helping the AI to discern concepts from a variety of resources.
By comprehending context from a vast amount of knowledge, AI can transform video, giving immeasurable insight to those that use it.
These insights can be seen when searching words that are spelled alike but have very different meanings. To toss out a simple example, look at the word ‘salt.’ When you searched that word, were you looking for the chemical compound sodium chloride, the seasoning on your table, the local restaurant by that name, or the history of mining it?
Ontology-based AI technology can distinguish what you were looking for by linking meaning vectors, thereby tailoring the organization of concepts in video to your individual needs.
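A toy illustration of that disambiguation, loosely in the spirit of the classic Lesk algorithm: pick the sense whose description overlaps most with the words around the query. The sense glosses below are made up for the example:

```python
def disambiguate(context: str, senses: dict[str, str]) -> str:
    """Pick the sense whose gloss shares the most words with the query context."""
    ctx = set(context.lower().split())
    return max(senses, key=lambda s: len(ctx & set(senses[s].lower().split())))

senses = {
    "compound": "sodium chloride a chemical compound of sodium and chlorine",
    "restaurant": "a local restaurant serving dinner and drinks",
    "mining": "the history of mining salt from underground deposits",
}
best = disambiguate("the chemical formula of sodium chloride", senses)
print(best)
```

Real ontology systems replace the word-overlap score with learned meaning vectors, but the principle of matching the query’s context against candidate senses is the same.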
Automating Business Processes
The combination of all of these innovations in AI can change the processes by which modern-day companies perform digital operations, drastically increasing both efficiency and interconnectivity.
By extracting metadata from employee interactions and team meetings, AI-driven technology can make expertise visible across departments.
This makes vital information that employees need readily available, traversing any silos that might exist in the business infrastructure. Once knowledge is centralized, AI can be proactive and create delightfully personalized experiences.
By gleaning context from employees’ interactions, both on video and through applications such as Slack or email, AI can begin to inform personal workflows.
In large companies with a distributed workforce, this can be especially pivotal — integrating the different communications channels and compiling all of the exchanged information like a diligent librarian. Furthermore, the AI can also gain further cognizance with access to data on the person’s role in a company or their daily workload.
This could, for example, help individuals get up to speed quickly if they could not attend a meeting or were out on vacation. The best part is that the information can be absorbed in a completely customizable manner.
This information can be translated back in whatever way is most convenient for the worker, whether that means expanding chapters, searching subjects with ease, or reading and listening at their leisure.
Automating processes will revolutionize how AI is implemented in business, making the digital workplace more efficient and less disparate. This, by proxy, will inform a new way of working across many industries. Computer science and AI form a massive structure whose foundation was built on a vast array of developments.
When the higher goal is focused on, many pertinent innovations can be achieved.
The next generation of AI will be able to do even more, with each floor of the computer science structure defined by the level below and no known limit to its potential height. AI in video will be a multiplier in the construction of this edifice, propelling the remote work world into the future.
By: Emily Senkosky, in collaboration with Graham Morehead at Ziotag Inc., a New York-based technology company that uses Artificial Intelligence (AI) to make searching and navigating video and audio content a seamless experience.