Facecomm

Made with ❤️ in India 🇮🇳

What is Facecomm?

Facecomm is a Made in India 🇮🇳 video calling and conferencing tool that puts user privacy and security first. It is a product and platform that provides an end-to-end social and enterprise communication solution. Facecomm is built for the Indian user base, with its diversity and challenges in mind.

Who are we?

We are a bunch of passionate, experienced folks from different domains who came together to build a Made in India social and enterprise communication platform for everyone.

Why choose Facecomm?

Privacy & Security 🛡️

Privacy and security are built into our core product ideology, and everything in the product revolves around safety and security. Most products think about privacy and safety at a later stage, but our product is built on a privacy and safety-first approach.

On-Premise Solution 🏢

For organisations that handle extra sensitive information, such as government bodies and banks, we offer on-premise deployments of our products so they can retain control over security and privacy.

Adaptive Communication 📡

Adaptive communication will be built into our product through design and technology. The product will adjust its behaviour based on the user's bandwidth and connectivity. Suppose a user is running the application on an extremely slow internet connection; the product will recognise this and adapt the interface and modes to provide a seamless communication experience.

Unified communication interface 🧩

Our product is a unified communication interface which can be integrated with any external system and complement the core functionality of the product.

Vernacular 🔀

With the help of Artificial Intelligence and Machine Learning, our product will support most Indian regional languages. This will give us a unique position and allow people to take part in the communication whatever language they speak.

Accessibility ✅

Because communication is the core of the product, we can provide a high level of accessibility to everyone. People with sensory disabilities who are not able to speak, hear or see will be able to take part in any communication on our platform.

Communication Inclusion 📞

Our vision is to make remote communication simple, secure and unified. We aim to create communication inclusion for all, so that anyone can be part of remote communication through our product, with or without internet.

Simplified User Experience 😍

Our product will provide a simplified and humanised user experience to all kinds of users. This will help first-time users get on board and also communicate and collaborate in a more user-friendly and efficient way.

Developer API ⚙️

We will open our developer API to everyone, so that anyone can build products, innovations or technology on top of it. This will open a revenue stream for us as well as open the doors to more creative ideas and innovation.

What are the features you provide?

Facecomm comprises exciting features ranging from basic communication to complex enterprise communication and collaboration. A few of the product features are listed below:

🎥 Video Call
🔊 Voice Call
💬 Chat
📃 File Sharing
🖥️ Screen Sharing
🎞️ Audio/Video Call Recording
🈁 Closed Captioning (Multilingual)
🔄 Speech to Text Transcript
📋 Meeting Notes
📄 Automated MOM
🚪 Open/Closed Rooms
⏰ Schedule Meeting
📆 Calendar Integration
📋 And a long list of other features

Business Docs

About Us

These are pillars of our business, which will create a base for everything that we do and keep us motivated throughout the journey.

Our Vision:

To make remote communication simpler (User Experience), secure (Technology) and unified (Innovation).
Simpler - Keep improving the experience.
Secure - Keep building more secure technologies.
Unified - Keep innovating different ways to unify the communication.

Our Mission:

To build the world's most secure, diverse and delightful communication systems.
Secure - There should not be any loophole in the security. (Technology)
Diverse - Product and technology should cater to a diverse set of users and scenarios. (Innovation)
Delightful - We will always delight our users in all their actions throughout their journeys. (Experience)

Our goals:

Become a market leader in remote communication. Tap existing segments of the market as well as segments that are still untapped.

Our Values:

Humans at the centre of everything - Humans should be at the centre of everything; humanise things, as we are humans.
Customer-first approach - Listen to the customer and put customers above everything else.
Ethical in all ways - Follow ethics in everything.
Transparent in nature - Be transparent with everyone in everything.
Data driven - Data is the reflection of whatever we do. Analyse, Empathise, Improvise.

Our Team

Here is the passionate team who is making things happen.

Backend

Rajeev Singh

Cyber Security Advisor

Data Science/ AI / ML

Frontend

Apps

Product & Design

Rajeev Singh

Lead Product Engineer @ GOJEK, ex-Directi, Yatra, Infosys

Full stack developer with experience in building intelligent & scalable applications using Java/Spring, Golang, Javascript/Node.js, React, and Python/Django.

I love distributed systems and the challenges associated with them. I like writing, teaching and mentoring other engineers.

Amit Rai

Product Engineer @ GOJEK, ex-Yatra, Infosys

Backend developer and functional programming enthusiast, experienced in building scalable software that solves real-world problems. Experienced in Golang, Scala, Haskell, Java/Spring, Javascript/Node.js, and React. I am a haemophiliac and take great pleasure in learning and creating software that has an impact on individuals.

Sourav Sen Gupta

Cyber Security & Cryptology Expert - Lecturer @ Nanyang Technological University, Visiting Lecturer - Indian Statistical Institute

I am a Researcher and Teacher, positioned precariously at the confluence of Computer Science, Mathematics, and Engineering. While my research interest revolves around the "science of security" (cryptology, cybersecurity), I love to teach the "art of analytics" (data science, machine learning).

I am passionate about expressing the hardest of topics in the simplest possible ways, to students and practitioners. My current research, teaching and consultancy interests dwell in the exciting domain of Cryptocurrencies, Blockchain Technology, as well as in applications of Machine Learning in Security.

Shreyas Mangalgi

Data Scientist @ Myntra | PGDBA IIM Calcutta | Electrical Engineering IIT Bombay

I am a Machine learning enthusiast who loves to build AI products for real life use-cases. My areas of research lie in Deep learning for Natural Language Processing and Computer Vision.

Parminder Singh

Senior Web UI Engineer @ Revolut, London, ex-Swiggy, Flipkart, Tapzo

Hello, I'm Param Singh, a Javascript Developer based out of Bangalore, India. I like creating awesome user interfaces that make the most of modern web trends. Being a JavaScript zealot, I like exploring new frameworks and libraries in the web development world. Besides this, I love listening to EDM music, trying out contemporary cuisines and cuddling dogs.

Renish Bhimani

Product Manager @ Automation Anywhere, ex - Apttus, LocalOye, Allscripts

Tech-savvy and versatile business analyst and solution architect offering 12+ years of success leading all phases of diverse technology projects; degrees in Computer Application; and years of computer programming experience in the health care, RPA and middle office industries.

Business strategist; plan and manage projects aligning business goals with technology solutions to drive process improvements, competitive advantage and bottom-line gains.

Software Development Management: experienced in managing teams across Mobile, Business Intelligence, RPA, Web and CRM.

Vindhya Chandrasekharan

Product Consultant, ex - LOCO, Directi, ITC

Led the Live and Interactive games unit at Loco as Product Manager. It was the fastest downloaded app ever on the Google Play Store. Within one year we grew from 100K users to 19Mn+, a significant percentage of whom play Live games. Have also collaborated with brands like Myntra and OnePlus to run sponsored games that have been very successful for their marketing purposes.

Nitin Rana

Senior UX Designer @ Myntra, ex - Directi, Tapzo, Localoye

“Focus on the user, and all else will follow.”

I'm an Interaction designer, who loves to solve problems and make products meaningful. My product and design philosophy is to make product Usable, Feasible and Scalable. I have a proven track record of working on complex and user-centric mobile and web products.

I work closely with team of product managers, developers and end users to make user-centric and data-driven decisions.

Understanding Communication

Some brainstorming done to understand the landscape of communication and the scope of opportunity.

What is Communication?

Communication is simply the act of transferring information from one place, person or group to another.

Every communication involves (at least) one sender, a message and a recipient. This may sound simple, but communication is actually a very complex subject. The transmission of the message from sender to recipient can be affected by a huge range of things. These include our emotions, the cultural situation, the medium used to communicate, and even our location.

How humans evolved with communication?

Human communication has evolved over a long period of time. Early interaction happened through facial expressions and gestures; then humans started talking to each other, later they started drawing, and finally they developed languages. Human development has not stopped since then. Communication is one of the primary aspects that helped us build relationships and create great civilisations.

Modern Communication

With the advancement of technology, humans have learned to use different modes of communication to accomplish day-to-day tasks efficiently, and that efficiency keeps increasing.

Modes of communication

There are several different ways we share information with one another. For example, you might use verbal communication when sharing a presentation with a group.

Verbal 💬

Verbal communication is the use of language to transfer information through speaking or sign language. It is one of the most common types, often used during presentations, video conferences and phone calls, meetings and one-on-one conversations. Verbal communication is important because it is efficient. It can be helpful to support verbal communication with both nonverbal and written communication.

Nonverbal 👌

Nonverbal communication is the transfer of information through the use of body language including eye contact, facial expressions, gestures and more.

Nonverbal communication is important because it gives us valuable information about a situation including how a person might be feeling, how someone receives information and how to approach a person or group of people.

There are several types of nonverbal communication, including:

Body language
Gestures
Facial expressions

Written ✍️

Written communication is the act of writing, typing or printing symbols like letters and numbers to convey information. It is helpful because it provides a record of information for reference. Writing is commonly used to share information through books, pamphlets, blogs, letters, memos and more. Emails and chats are a common form of written communication in the workplace.

Visual 🎨

Visual communication is the act of using photographs, art, drawings, sketches, charts and graphs to convey information. Visuals are often used as an aid during presentations to provide helpful context alongside written and/or verbal communication. Because people have different learning styles, visual communication might be more helpful for some to consume ideas and information.

Ways of communication

Direct communication 🗣️

Direct communication is the natural physical communication between individuals and groups. Since the origin of the human race, direct communication has been the key to building communities and relationships.

Remote communication 📱

Remote communication allows people in different locations to communicate and collaborate. They use tools and mediums like email, chat, and online collaboration tools to facilitate the communication. On the technical side, remote communication deals with transferring data between devices that are not located in the same place.

Visual mode - Video calling
Verbal mode - Voice calling
Written mode - Messaging

Type of communication

Non - Professional (Personal) 😎

Personal remote communication can easily be managed through the many free or paid applications available. Any mode of communication can be used for personal purposes, such as simple voice calls, WhatsApp or Facebook messages, or simple video calls.

Professional 👨‍💻

In professional communication, multiple factors need to be considered even when communicating directly. With remote communication in an organisation, how we communicate changes drastically based on demography, the nature of the organisation, geographical conditions, and the scale and impact of the organisation, and the size of the challenge is even higher.

Whatever the kind of organisation, communication within it takes different forms. These are divided into categories below, each described in detail.

Nature of communication

Social 👩‍👩‍👧‍👧

In organisations, some communication is very social in nature: people from different teams come together and communicate with each other. Often this social communication extends outside the organisation and across multiple organisations under a bigger umbrella. It can take different forms, such as informative, explanatory, collaborative, open-ended or restrictive.

Private 👬

Private communication is restricted communication, or any communication made under circumstances creating a reasonable expectation of privacy. This form of communication reaches its intended receiver in a private space such as a group, room, channel or 1-1 conversation.

Confidential 🤝

Information exchanged between two people who have a relationship in which private communications are protected by law, and intend that the information be kept in confidence. The law recognises certain parties whose communications will be considered confidential and protected, including spouses, doctor and patient, attorney and client, and priest and confessor. The intention that the communication be confidential is critical.

Market Study

Google Duo

USPs of Google’s approach

Optimized for low-bandwidth
End-to-end encryption by default
Based on phone numbers allowing users to call someone from their contact list

Cons of Google Duo

Maximum 12 participants allowed currently.

Technologies used by Google Duo

WebRTC
QUIC over UDP
WaveNetEQ (For packet loss concealment), a generative model based on DeepMind/Google AI’s WaveRNN

Facebook Messenger

USPs of Facebook messenger

Fun masks, effects, and filters on top of the video
User limit of 50
Facebook Live with Co-broadcasting

Technologies used by Facebook Messenger

WebRTC
MQTT for Signalling

Microsoft Teams

Microsoft Teams, both in regular browsers and in the Teams app, relies on WebRTC for audio and video communications.

Business Use Cases

There are many sectors and use cases, covered and uncovered, where our product can bridge the communication gap. A few of the potential industry and business use cases are described below.

Education

The education sector is one of the biggest sectors impacted by the COVID-19 pandemic. Our product can provide a sustainable long-term solution for the education sector, from schools to colleges to higher education institutions, through remote connectivity.

Attending classes and lectures remotely, sharing study materials, scheduling timetables, creating notes, and cohesive communication and collaboration can all benefit from our product.

In the longer run, it could provide a solution for conducting remote examination and publishing results online.

Public Health

Our solution can provide seamless communication and interchange of information between patients and doctors, doctors to doctors and connect small health organisations to bigger health institutions.

A patient could connect directly with a doctor from home, and the doctor could see the symptoms and prescribe medicines remotely.

Doctors and experts can exchange information without barriers and save lives. In tough situations, medical experts can examine critical patients remotely and guide other medical practitioners through minor or major operations.

Media & Entertainment

The entertainment industry has been badly impacted by the COVID-19 pandemic, as performers, artists and other creative people cannot physically perform in front of an audience or in theatres.

Our platform could help these performers connect with millions of the audience remotely through the broadcast feature. Individuals can connect with the artist on our platform. Performers can come together and perform remotely either to create the revenue or to generate donations.

Banking & Finance

Banking is one of the most affected sectors by the pandemic. In banking, many of the activities and verification procedures happen on the ground. Many financial institutions are already using technology to reduce physical interaction and friction.

Our product could help banks connect with users more efficiently without going on the ground. Banks can verify users remotely for their financial services and provide loans and financial help to those in need. Our product could also help financial institutions create better customer support and experience.

Public Security and Safety

Security is built into the core of our product, so governments can use it for public security and safety. One scenario is where our system helps monitor and identify public activity in high-alert areas for surveillance. Similarly, our product could be used in collaboration with governments to monitor suspected areas and track criminal activities to avoid worst-case scenarios.

Agriculture

The agriculture sector is another big sector which could leverage our platform to connect with others remotely. Farmers can approach local authorities directly for help and to stay informed about new government policies and announcements.

Our Prime Minister and other ministers could be remotely connected to individual farmers through broadcasting.

In villages, our product could organise remote panchayats, where local authorities and government can remotely connect with panchayats and villages.

Judiciary & Legal

The judiciary is another sector impacted by the pandemic, where standard judicial procedure has been disrupted because public gatherings and movement are prohibited. Our product could be used in the judicial system to fast-track the judicial process under remote conditions as well.

Manufacturing

To manage global operations efficiently, manufacturing companies need real-time access to their worldwide resource and supplier base. Our video conferencing systems can enable real-time virtual product reviews and product development status meetings with offshore plants over video, saving considerable time compared with shipping products for in-person feedback. From procurement to engineering to final product review, our system can improve manufacturing efficiency every step of the way.

Recruitment

With remote working becoming the new normal, companies want to hire global talent from other parts of the world. A video conferencing solution having integration with recruitment products like Lever, Recruiterbox, etc can go a long way in shaping the way companies hire talent.

Professional Services

In a remote-first world, the gap between work and educational or professional services will thin out. We are prepared for this and will add features for services outside a technology-first work environment. Services like language classes, musical instrument lessons, CA services, yoga, Zumba and home exercise can all be hosted within our application. We will add voting, attendance, reviews/ratings and other ancillary features to support this.

Marketing

As you scale a remote-friendly organisation, all social events can easily reach a global audience. You can take advantage of the Webinar and Live streaming features of our product to broadcast communications and special events. You can promote new launches, news, and activities. You can provide executive updates and hold remote contractor and vendor meetings over video conferencing.

Sales

Our product can enhance the way sales professionals conduct remote customer meetings, partner conferences, business reviews, sales pitches, etc. Our Minutes of the Meeting feature can be used to document customer meetings and customer interviews. We can integrate with sales CRM products like Salesforce, FreshWorks, and others to provide a seamless experience for sales professionals. We also plan to substantially enhance the in-tool calendar for sales professionals who travel, so they are always at the top of their game.

State & Local Government

Remote governance is the future. State and Local government and various government bodies like Panchayat, Municipality, City council, Wards, and different ministries can work together remotely using video conferencing. The conversations can be recorded for future purposes. Our Minutes of the Meeting feature and Closed Captioning feature can bring together representatives from different languages and states to solve common problems.

One of our aims is to be human-centric, which means that we want to showcase the fun that happens in various social settings. We want to provide a level of social engagement by leveraging Artificial Intelligence and Augmented Reality, similar to Snapchat and other fun social engagement apps.

Events & Webinars

Being social is at the centre of every experience we share, so we aim to make the online experience feel just like the offline one. We want people to host big and small events right from their chairs. You can organise small and large-scale remote events, social gatherings, watch parties, discussions, and much more to engage people and build communities.

Customer Support

We aim to bring in automatic voice and video responses to help our users serve their customers. Voice and video feel more connected, and if our AI/ML efforts can make these responses seem more human, we think we can make a real difference. Our chat and video product could be integrated with customer support systems to provide a better customer support experience.

Home Security

We aim to go above and beyond just being a conferencing tool. With our expertise in real-time video processing, we will give users timely updates about the security of their homes. We want to upgrade and re-imagine the clunky camera setup and the stream delay. Our product could be integrated with home security systems to perform monitoring and surveillance from any remote place.

Product Positioning

Product positioning is a form of marketing that presents the benefits of your product to a particular target audience. Through market research and focus groups, marketers can determine which audience to target based on favourable responses to the product.

Current Positioning

We have initially positioned ourselves at SMEs and large organisations, with the expectation of medium adoption in the market. It is a fairly large market, with enough space for new players to come in and capture some share. This could be a sweet spot to start with in the wide canvas of communication.

Future Prospect (Vertical)

Vertically, we will try to go deep into larger organisations and aim for high adoption with a large market share. Here our focus will be on solving complex large-organisation problems: efficient communication and collaboration, and increased productivity. We will try to become the market leader in this segment.

Long term market (Horizontal)

Horizontally, we will be looking into the social space, where our products will have higher adoption and virality. This is a highly crowded market where a lot of social and engagement products already exist. This could be a good horizontal market to look into in the longer term.

Product Market Fit & Value Prop

Value Proposition

The Value Proposition Canvas is a tool which can help ensure that a product or service is positioned around what the customer values and needs.

The Value Proposition Canvas is formed around two building blocks – customer profile and a company’s value proposition.

Customer Profile

Gains – the benefits which the customer expects and needs, what would delight customers and the things which may increase likelihood of adopting a value proposition.
Pains – the negative experiences, emotions and risks that the customer experiences in the process of getting the job done.
Customer jobs – the functional, social and emotional tasks customers are trying to perform, problems they are trying to solve and needs they wish to satisfy.

Value Map

Gain creators – how the product or service creates customer gains and how it offers added value to the customer.
Pain relievers – a description of exactly how the product or service alleviates customer pains.
Products and services – the products and services which create gain and relieve pain, and which underpin the creation of value for the customer.

Product Market Fit

The Product/Market Fit Canvas is a strategic innovation tool. It allows you to define, validate and reach your customers. It also allows you to define and iterate your product to achieve market validation with your product.

Risk Mitigations

Following are the Risks involved and a short summary of how we mitigate those risks with our solution.

Privacy and user data protection (Cyber Security, Hackers attack, Data protection policies)

We have designed our solution with security in mind. We have used W3C technical standards and protocols that are highly secure and provide end-to-end encryption. We have taken extra care to make sure that all communication is secure and that user data is stored in a secure format in our databases. We also have a cyber security advisor, Sourav Sen Gupta (https://www.souravsengupta.com/), who is helping us make sure that our infrastructure is secure.

Apart from network security, we have added several application-level security features such as password-protected rooms, role-based access, and host controls like muting a participant, pausing a participant's video, removing a participant from a conference, putting participants in a waiting room, and many others. Moreover, unlike other video conferencing solutions, we default to the highest security setting for any meeting or webinar. Hosts can turn off some of the application-level security features if they want to.

We offer an on-premise hosting solution which allows all the backend infrastructure and the database to be hosted in a government-owned data center. This gives full control over all the data and prevents unwanted access.

Scaling the video communication infrastructure

We have designed our backend infrastructure to scale to millions of concurrent connections. Our infrastructure is capable of supporting more than 500 participants per meeting room which can be scaled even further by adding more SFU servers. We use a concept called SFU cascades wherein a conference can span across multiple SFU servers. This is why we’re not limited by the CPU and bandwidth of a single SFU server. All our backend servers are horizontally scalable. To handle more load, we can add more servers and the infrastructure will be able to support additional load.
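The cascading idea can be illustrated with a minimal sketch. It is not our production allocator: the `SfuServer` type, the server names and the per-server capacity of 200 are hypothetical values chosen only to show how one conference can span several SFU servers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SfuServer:
    """One media server in a cascade (name and capacity are hypothetical)."""
    name: str
    capacity: int
    participants: list[str] = field(default_factory=list)


def assign_participants(participants: list[str], capacity_per_sfu: int) -> list[SfuServer]:
    """Spread one conference across as many SFU servers as it needs."""
    if capacity_per_sfu <= 0:
        raise ValueError("capacity_per_sfu must be positive")
    servers: list[SfuServer] = []
    for person in participants:
        # Open a new server when there is none yet or the last one is full.
        if not servers or len(servers[-1].participants) >= servers[-1].capacity:
            servers.append(SfuServer(f"sfu-{len(servers) + 1}", capacity_per_sfu))
        servers[-1].participants.append(person)
    return servers


# Assumed numbers: a 500-person room and 200 participants per server.
cascade = assign_participants([f"user-{i}" for i in range(500)], capacity_per_sfu=200)
print([(s.name, len(s.participants)) for s in cascade])
```

With these assumed numbers the room is split 200/200/100 across three servers, and a larger room would simply open more servers.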

We also support multi-datacenter deployments by default. The users are connected to the datacenter closest to them to improve latency.
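One simple way to pick the closest datacenter is to choose the one with the lowest measured round-trip latency. The sketch below assumes the client has already measured latency to each datacenter; the datacenter names and latency figures are made up for illustration.

```python
def closest_datacenter(latency_ms: dict[str, float]) -> str:
    """Return the datacenter with the lowest measured latency."""
    if not latency_ms:
        raise ValueError("no datacenters measured")
    return min(latency_ms, key=latency_ms.get)


# Hypothetical measurements taken by the client before joining a call.
measured = {"mumbai": 24.0, "chennai": 41.5, "delhi": 37.2}
print(closest_datacenter(measured))
```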

Low network and bandwidth

India, being a developing nation, doesn’t have high-speed internet everywhere. We believe that technology should empower everyone, whether they live in urban cities or in remote villages with slow network connections. Our solution detects network conditions and automatically scales video resolution and bitrate up or down to provide the best possible experience to users. Our media server (SFU), which handles video/audio communication, supports error correction, handles packet loss, and forwards video/audio streams to receivers according to their bandwidth and display.
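A minimal sketch of the adaptation step follows, assuming the sender publishes a small ladder of quality layers and the server forwards the best one that fits each receiver's estimated bandwidth. The ladder values, the 20% headroom and the names `Layer` and `pick_layer` are illustrative assumptions, not our actual configuration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    label: str
    height: int          # vertical resolution in pixels
    bitrate_kbps: int    # approximate bitrate needed for this layer


# Hypothetical quality ladder, ordered from best to worst.
LADDER = [
    Layer("high", 720, 1500),
    Layer("medium", 360, 600),
    Layer("low", 180, 150),
]


def pick_layer(available_kbps: float, headroom: float = 0.8) -> Layer | None:
    """Return the best layer that fits the bandwidth budget.

    Returns None when even the lowest layer does not fit, in which case
    the call could fall back to audio only.
    """
    budget = available_kbps * headroom  # keep some spare capacity
    for layer in LADDER:
        if layer.bitrate_kbps <= budget:
            return layer
    return None


for estimate in (2000, 500, 100):
    chosen = pick_layer(estimate)
    print(estimate, chosen.label if chosen else "audio-only")
```

Re-running `pick_layer` whenever the bandwidth estimate changes gives the up and down scaling described above.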

AI/ML

Some of the technical hurdles we would face will be in the field of AI/ML.

We are building closed captioning as part of our solution. Building this requires large training datasets in multiple Indian languages, which are difficult to obtain since AI-based speech-to-text for these languages is still a young field. We plan to collaborate with Indian academic institutes for our data needs.

In order to develop our AI solutions, we would require powerful compute resources in the form of GPUs to perform parallelised large scale mathematical operations on our datasets.

Once we have developed the closed captioning solution, we need to make sure that it can perform speech-to-text tasks with reasonable latency. In this regard, we are evaluating multiple state of the art frameworks and open source solutions to make sure that all the performance requirements are met for a smooth user experience.

Apart from this, there are challenges involved in the cyber security field to make sure that we leave no stone unturned when it comes to security. We have taken extra measures to ensure that all the communication is end-to-end encrypted and have employed several application level security measures. We are also in touch with experts in the cyber security domain who are willing to guide us.

Product Roadmap

The product roadmap is split into two pillars.

Vertical: We will build for general use cases that apply regardless of the sector or of who uses our product. We will grow these features on the basis of our Vision - To make remote communication better.

Horizontal: We plan to eventually be a marketplace where we give developers access, if required, to build on top of our E2E encrypted remote video conferencing product. These efforts, however, have to start in-house, which means that we need to build a few features and products ourselves to begin this innovation within the product.

Expansion plans

We have a reasonably diverse team that we plan to work with, covering areas from artificial intelligence to cybersecurity. We plan to keep the team lean until we generate revenue and achieve product-market fit. We plan to build our distribution by starting with our friends and moving to progressively bigger circles, making sure our customers have a smooth journey while adopting us. (More details in the diagram below). Once we have reached a certain scale and are open to exploring our horizontal pillars, we plan to expand our team close to where our sales prospects take us. Then we plan to add sales, marketing and customer success efforts.

Collaborations/tie-ups:

We would want to collaborate with the education ministry and other sectors once we scale horizontally. This is still an exploratory space; once we have a fit for our product, we can expand more on this.

So in the roadmap, we will address both of these pillars. We will focus on vertical pillars in the short term to achieve product-market fit. We will, in the long term, move to unbundle segments and add more horizontal pillars.

Vertical

These are the general use-cases that apply regardless of sector or who uses our product. We will grow these features guided by our Vision - To make remote communication better.

Signups

For signups, we plan to keep 1-1 calls open, with no signup required. However, to schedule calls in the future, we will offer social (Facebook, Google) signups or build our own login system. This is mainly so our users can log in & see their contacts & schedules at their convenience.

Conference rooms

We want our remote conferencing to support various kinds of conversation and not just meetings. For this purpose, we want to introduce various rooms.

Closed Room - A password-protected room that is invite-only.
Open Room - A water-cooler space, which is open all the time for any on & off discussions within the remote work environment.
Broadcast Rooms - A room used to deliver a message to a wide audience. In an office, it could be the CEO’s message to all employees.

User Privacy

As an application, we are fully encrypted and private. However, we also aim to provide user-centered design, which means giving users explicit controls. That is what this set of features does.

Audio - Ability to mute/unmute audio.
Video - Ability to switch video on/off.
Pin - preview

Beautification & Delight

We believe that a remote world doesn't take away the need to be ready: prepared, professional and easy to adapt. This is why we are introducing these features.

Background change - Ability to change the background to a few templatized images.
Image - Ability to change your profile picture.
Background blur - Focus on the speaker is vital, so we give an ability to blur the background.

Accessible knowledge

It is vital to keep communication beyond conferencing & calls. For these reasons, our next set of features focuses on making information more accessible.

Transcript - A fully automated transcript via speech recognition.
Closed Captions - Providing subtitles for every conference.
Minutes of the Meeting - Automated minutes of all the meetings, by analysing transcripts with Natural Language Understanding (NLU)

Horizontal

We plan to eventually be a marketplace where we give developers access, if required, to build on top of our E2E encrypted remote video conferencing product. These efforts, however, have to start in-house, which means that we need to build a few features and products ourselves to begin this innovation within the product.

Organisational security & Privacy

As we build for various organisations, it is important to give them access controls.

Org Admin Roles and controls - Giving access to the admins of each organisation & letting them control which users have which permissions. Super Admin, Admin and User levels are well defined.
Account switching (Personal/Professional/multi-account) - We aim to serve organisational as well as personal use; hence switching to personal use of the tool should be effortless.
Chat history/Room history - Ability to control how much information about confidential conferences can be given to different users.

Data Privacy

Contact list - Open/closed access to the contact information within the organisation.
Schedules, Calendar - Open/closed access to the schedules/calendar information within the organisation.
Personal meeting ID - Password-protected meeting IDs to protect data & privacy.

Integrations:

For an organisation in any sector to work well, its communication tools have to be integrated with its other channels. We aim to do that so organisations can work efficiently.

Chat integrations - Integrating chat apps to remind you about conference calls.
Browser integration - Introducing browser extensions to access conference calls & get notified about them from the web.
Calendar integration - Integrating calendar apps to remind you about conference calls & to add the conference link to calendar meetings.

Freemium based model

We plan to start with the freemium model, but we want to provide a tiered pricing strategy eventually. Paid small-org, paid mid-org and paid large-org plans will be on different tiers, and the features we open will depend on size & data usage.

We aim to provide a social presence within our application so as to drive more engagement among users.

Channels - Different channel conversations catering to the interests of the users, e.g. #technology, #dance etc.
Membership - People could take free/paid membership of any channel.
Donation - Allowing people to donate to bigger causes around them, and create social impact.

Event organising

Organising catchups, webinars & interviews will be at the centre of remote collaboration.

Create an event - Ability for users to create events & share them with interested people easily.
Webinar - Ability for users to easily create webinars & share them with interested people.
Anonymous Voting feature / Poll - Ability to vote/poll on interesting, community-shared topics.

Collaboration

Every communication is incomplete without powerful collaborative tools, so we will expand on this.

Interactive Board - Ability to collaborate and discuss on top of video calls.
Presentation mode - Ability to present your ideas and discuss on top of video calls.
Screen highlight - As we explore education as a big sector, we think the ability to highlight text on screen will be powerful for teaching & collaboration.

Enhancements

While we explore the fun aspects that remote communication would bring, we feel it is important to provide certain hooks through enhancements to keep people involved.

Filters - Beautification of participants by providing filters to enhance appearance.
Masks - Adding fun elements on top of videos.
Reactions/gifs - Adding integrations with gif websites to react or comment on the video/audio call.

Tech Documentation

Our Solution

We use WebRTC for real-time audio and video communication and WebSocket for sending chat messages and initiating a video/audio call. We also support live streaming: the stream is sent over WebRTC and broadcast to other users using HLS.

Process Flow

We have added process flow diagrams for all the flows like Creating/Joining a meeting room, Initiating a Video/audio call, Sending/receiving chat messages, and Live streaming in the backend architecture section.

The following diagram describes the high-level process flow for video conferencing.

Users interact with one of our mobile applications or web clients to start/join a video conference. The client first establishes a WebSocket connection with the Gateway server. If the user is already authenticated, the client sends the authentication token in the request header, which is validated by the Gateway server.

Once a WebSocket connection is established successfully, the client makes a request to create/join a room. The Gateway server calls the Room server over a gRPC connection to create a room and generate a password for it, or to validate the roomId and password if the user is trying to join an existing room.

Once the user joins the room, the client creates a WebRTC SDP offer to establish a WebRTC connection with the media servers for exchanging video/audio streams. The Gateway server receives the SDP offer and calls the Room server to find the media server to which the user’s offer should be sent.

Room servers constantly monitor media servers through a service discovery mechanism and find the best-suited media server for a user. Once the media server is identified, the Gateway server sends the user’s offer to the media server. The media server responds with a WebRTC SDP answer, which is sent back to the client.

Once the offer/answer flow is complete, the client establishes a persistent WebRTC connection with the media server and starts sending video/audio streams to the media server.

The media server is the main component: it receives video/audio streams from all participants in a room and sends those streams to the other participants in the room.

Backend Architecture

Authentication

We allow guest users as well as authenticated users in our application. The Gateway server is responsible for authenticating users and validating every incoming request.

To authenticate users, we support both email/password-based login and OAuth2-based login using Google and Facebook. All passwords are hashed using bcrypt before being stored in our database. Once a user successfully logs in through either email/password or Google/Facebook, we create a JWE token and send it back to the client. The client sends this token in an Authorization header along with every request to the server.

Creating/Joining a Meeting Room

When a user wants to create a new meeting room or join an existing room, the client first connects to the Gateway server using a secure WebSocket connection. The WebSocket connection is maintained throughout the life cycle of the meeting. It is used for sending chat messages and other events like user-join, user-leave, etc.

Once the WebSocket connection is established, the client sends a request to the Gateway server to join or create a room. The Gateway server calls the Room server to create a room, or to validate the RoomId and password if the user is trying to join an existing room. When a user successfully joins a room, a ‘user-join’ event is sent to all the other participants in the room.
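
The create/join step above can be sketched as follows. This is a minimal illustration and not our actual Room server: the class name RoomRegistry, its methods, the in-memory storage and the shape of the returned event are all assumptions made for the example.

```python
import hmac
import secrets


class RoomRegistry:
    """In-memory stand-in for the Room server's room store (hypothetical)."""

    def __init__(self):
        self._passwords = {}   # room_id -> password
        self._members = {}     # room_id -> list of user ids

    def create_room(self, user_id):
        # Generate an unguessable room id and password for the new room.
        room_id = secrets.token_urlsafe(8)
        password = secrets.token_urlsafe(12)
        self._passwords[room_id] = password
        self._members[room_id] = [user_id]
        return room_id, password

    def join_room(self, room_id, password, user_id):
        expected = self._passwords.get(room_id)
        # Constant-time comparison avoids leaking information through timing.
        if expected is None or not hmac.compare_digest(
            expected.encode(), password.encode()
        ):
            raise PermissionError("unknown room or wrong password")
        # Participants already in the room are the ones to be notified.
        notify = list(self._members[room_id])
        self._members[room_id].append(user_id)
        return {"event": "user-join", "user": user_id, "notify": notify}
```

A Gateway server would forward the returned event to each user listed under notify over their WebSocket connections.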

Initiating a Video/Audio Call

After joining the room, the client initiates video/audio communication by creating an SDP (Session Description Protocol) offer and sending it to the Gateway server. The Gateway server calls the Room server to find the media server to which this user should be sent. Room servers constantly monitor media servers through a service discovery mechanism. They try to assign all users in the same data centre to the same media server (up to a certain limit) to improve efficiency. After getting the response from the Room server, the Gateway server sends the SDP offer to the media server. The media server responds with an SDP answer, which is sent back to the client.

After the offer/answer flow, the client establishes a secure WebRTC connection with the media server and starts sending Audio/Video streams to the server. The media server routes the audio/video streams of every user to every other user in the same room.

Managing top-notch Quality and Performance

The Media Server, also called an SFU (Selective Forwarding Unit), is the core component responsible for transmitting audio/video streams between the participants in a room.

The SFU has the following characteristics:

Low delay: The SFU only forwards the video, so it adds minimal delay to the video/audio streams.
Selective Forwarding: Our SFU is selective. From the incoming video streams, it picks a subset to forward to each receiver, and that subset can be different for every receiver. Depending on the receiver’s available bandwidth or display, it picks the subset of video streams to forward. If the receiver is watching a participant in a big window, it chooses a high resolution for that video stream. If a participant is hidden in the receiver’s display, the SFU won’t send that participant’s video stream to the receiver.
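
To make selective forwarding concrete, here is a minimal sketch of the per-receiver decision. It is an illustration under stated assumptions, not our SFU's code: the three quality layers, their bitrates, the View type and the greedy budget strategy are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical quality layers: (name, kbps needed), highest quality first.
LAYERS = [("high", 1500), ("medium", 500), ("low", 150)]


@dataclass
class View:
    sender: str
    visible: bool   # is the sender's tile on the receiver's screen?
    large: bool     # is it shown in a big window?


def select_streams(views, budget_kbps):
    """Return {sender: layer} to forward to one receiver."""
    chosen = {}
    # Serve large tiles first so they get the best layer the budget allows.
    for view in sorted(views, key=lambda v: not v.large):
        if not view.visible:
            continue  # hidden participants are not forwarded at all
        for name, cost in LAYERS:
            if not view.large and name == "high":
                continue  # small tiles never need the top layer
            if cost <= budget_kbps:
                chosen[view.sender] = name
                budget_kbps -= cost
                break
    return chosen
```

With a generous budget the large tile gets the high layer. As the receiver's bandwidth drops, every visible tile steps down a layer before any of them is dropped.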

Apart from the optimisations at the SFU, our clients adapt the bit-rate and choose the appropriate video resolution for the current network conditions. Our clients also let users select the video resolution they want to use.

Sending & Receiving chat messages

To send a message to the other participants in a room, a user sends the message to the Gateway server they are connected to. The Gateway server forwards this message to the Room server. The Room server first stores the message in the database. It then looks up the Redis cache to find all the Gateway servers that have users in that room, and sends the message to those Gateway servers. Finally, the Gateway servers deliver the message to the individual participants over their WebSocket connections.

The Room server acts as a message router, and the Gateway servers manage WebSocket connection with individual users.
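
The routing described above can be sketched as follows. This is a toy model only: plain Python containers stand in for the database and the Redis cache, and the names RoomRouter, register and route are hypothetical.

```python
from collections import defaultdict


class RoomRouter:
    """Toy Room server: stores messages and routes them to gateways."""

    def __init__(self):
        self.db = []                     # stand-in for the database
        self.room_gateways = {}          # stand-in for the Redis cache
        self.outbox = defaultdict(list)  # gateway id -> queued deliveries

    def register(self, room_id, gateway_id):
        # Called when a user connected to gateway_id joins room_id.
        self.room_gateways.setdefault(room_id, set()).add(gateway_id)

    def route(self, room_id, sender, text):
        message = {"room": room_id, "from": sender, "text": text}
        self.db.append(message)                                  # 1. persist first
        targets = sorted(self.room_gateways.get(room_id, ()))    # 2. look up gateways
        for gateway_id in targets:                               # 3. one copy per gateway
            self.outbox[gateway_id].append(message)
        return targets
```

Each gateway receives one copy of the message no matter how many of its users are in the room. The gateway then fans it out over its own WebSocket connections.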

Managing scale

We have designed our architecture to scale to virtually any number of participants per room. Our architecture also supports multi-datacenter deployments without any tweaks to the backend services.

There are three main components in our application - Gateway server, Room server, and Media server. All of these servers are horizontally scalable. The media server is responsible for receiving and forwarding audio/video streams. Let’s understand how we scale the media server.

Each media server periodically reports its health and load, and this information is curated and placed into our service discovery system. Whenever a user joins a new room, the Room server consults the service discovery system and assigns the least utilized media server to that user. Any other user joining the same room is sent to the same media server, as long as the user is connected to the same data centre and the media server is capable of taking more load. If the media server can’t take more load, or the user connects to a different data centre, the user is sent to a media server which is geographically closer to them and is able to take on more load. Once we do this, we have one conference room spanning multiple media servers. In this case, we set up a server-to-server relay between the existing media server and the new media server. This makes sure that any audio/video stream sent to the first media server is relayed to the second media server and vice versa.
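
The assignment policy above can be sketched as follows. This is a simplified illustration, not the Room server's code: the MediaServer fields and the MAX_LOAD cut-off are assumptions, and "geographically closer" is reduced to "same data centre if possible, otherwise any server with spare capacity".

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    name: str
    datacenter: str
    load: float          # reported utilisation, 0.0 to 1.0
    healthy: bool = True


MAX_LOAD = 0.8           # assumed cut-off beyond which a server takes no new users


def assign(servers, room_servers, user_dc):
    """Pick a media server for a user joining a room.

    room_servers: names of the media servers already hosting the room.
    Returns (server, needs_relay).
    """
    available = [s for s in servers if s.healthy and s.load < MAX_LOAD]
    local = [s for s in available if s.datacenter == user_dc]
    # Prefer a server in the user's data centre that already hosts the room.
    for s in local:
        if s.name in room_servers:
            return s, False
    pool = local or available
    if not pool:
        raise RuntimeError("no media server has spare capacity")
    # Otherwise take the least utilised candidate.
    best = min(pool, key=lambda s: s.load)
    # A relay is needed when the room already lives on other servers.
    needs_relay = bool(room_servers) and best.name not in room_servers
    return best, needs_relay
```

When needs_relay is true, the room now spans more than one media server and a server-to-server relay has to be set up between them.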

Handling Failures

We have taken extra care to make sure that there is no single point of failure in our system. There are multiple instances of each server running in the backend. Every server instance reports its health to the service discovery system (Consul). If an instance crashes, it is removed from Consul, and the backend adapts accordingly.
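
The idea of dropping crashed instances can be illustrated with a TTL-based registry. This sketch does not use the Consul API. The Registry class, its TTL value and the explicit now parameter are assumptions chosen to keep the example self-contained and deterministic.

```python
class Registry:
    """Toy service registry with TTL heartbeats (not the Consul API)."""

    def __init__(self, ttl_seconds=10.0):
        self.ttl = ttl_seconds
        self.last_seen = {}   # instance id -> time of last heartbeat

    def heartbeat(self, instance_id, now):
        # Each server instance calls this periodically to report that it is alive.
        self.last_seen[instance_id] = now

    def healthy(self, now):
        # Drop instances whose last heartbeat is older than the TTL.
        expired = [i for i, t in self.last_seen.items() if now - t > self.ttl]
        for instance_id in expired:
            del self.last_seen[instance_id]
        return sorted(self.last_seen)
```

An instance that crashes simply stops sending heartbeats, so it disappears from the healthy list once its TTL lapses and no new traffic is routed to it.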

We have also set up alerts using Prometheus and Grafana to get notified whenever any component in our architecture experiences issues such as high CPU or memory usage, or general application errors like an increase in 5xx errors.

Live Streaming

We support live broadcasting from your device to the outside world. The audio/video stream is captured on the local device and sent to the HLS gateway over a WebRTC connection. We transcode the video at the HLS gateway and send it to CDN servers, from where the video stream is distributed to other users via HLS.
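
For readers unfamiliar with HLS, viewers repeatedly fetch a small playlist file that lists the most recent media segments. The sketch below builds such a live playlist. The function name and the segment file names are hypothetical, and it covers only the playlist text, not transcoding or CDN upload.

```python
import math


def live_playlist(segments, media_sequence):
    """Build a live HLS media playlist.

    segments: list of (uri, duration_seconds) currently in the live window.
    media_sequence: sequence number of the first listed segment.
    """
    # The target duration must be at least as long as the longest segment.
    target = max(math.ceil(duration) for _, duration in segments)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    # No #EXT-X-ENDLIST tag: the stream is still live.
    return "\n".join(lines) + "\n"
```

As new segments are produced, the oldest entries are dropped from the list and media_sequence is incremented, so players keep following the live edge.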

Security by Design

We have taken utmost care to make the application as secure as possible. We can divide security features into:

Transport level security
Application level security

Transport level security

WebRTC provides mandatory encryption of our video/audio streams. Media is encrypted with SRTP (Secure Real-Time Transport Protocol or Secure RTP), with keys negotiated through a DTLS (Datagram Transport Layer Security) handshake.
HTTPS/WSS - All connections to the Gateway service happen over Secure WebSocket and HTTPS connections.
Databases - All the data in our database is encrypted.

Application level security

We provide many application-level security features to secure any meeting from unwanted access. We have -

Password protected rooms
Role-based access
Host controls - Host can mute a participant, pause their video, remove a participant from the meeting, allow/disallow screen sharing etc.

Unlike other video conferencing solutions, we default to the highest security setting for any meeting/webinar. Hosts can turn off some of the application-level security features if they want to.
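
A minimal sketch of role-based host controls with secure defaults is shown below. The Meeting class, the action names and the two settings are illustrative assumptions and do not describe our actual permission model.

```python
# Hypothetical set of actions reserved for hosts.
HOST_ACTIONS = {"mute", "pause_video", "remove", "set_screen_sharing"}


class Meeting:
    def __init__(self, host):
        self.roles = {host: "host"}
        # Most restrictive settings by default; only a host may relax them.
        self.settings = {"password_required": True, "screen_sharing": False}

    def join(self, user):
        # Everyone who is not already known joins as a plain participant.
        self.roles.setdefault(user, "participant")

    def allowed(self, user, action):
        return self.roles.get(user) == "host" and action in HOST_ACTIONS

    def set_screen_sharing(self, user, enabled):
        if not self.allowed(user, "set_screen_sharing"):
            raise PermissionError(f"{user} may not change screen sharing")
        self.settings["screen_sharing"] = enabled
```

Starting from the strictest settings means a host who forgets to configure a meeting still gets a protected room.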

Closed Captioning

We’re exploring various approaches and open source ML frameworks to perform closed captioning. Closed captioning with multilingual support involves the following steps:

Obtain a transcript of the content.
Translate the transcript into the target language.
Create the corresponding subtitles from the transcript or translation and create a subtitles file.
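
The last step, creating a subtitles file, can be sketched as follows. The example writes the SubRip (.srt) format and assumes the transcript is already available as timed segments. The function names and the segment tuples are hypothetical, and the speech recognition and translation steps are not shown.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used by SubRip (.srt)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """segments: iterable of (start_seconds, end_seconds, text)."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue is: running number, time range, text, then a blank line.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

The same segments could be passed through a translation model first, producing one subtitles file per target language.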

We’re currently trying out a POC based on DeepSpeech, a TensorFlow implementation of Baidu’s DeepSpeech architecture, along with Mozilla voice and other machine learning models for language translation.

Device and Audio/Video codec support

Works on Mobile apps, web browsers, and Desktop apps.
Supports common audio/video codecs such as VP8, VP9, H264, Opus, G722, PCMU, PCMA, etc.
Supports adaptive bit-rate, automatically chooses the right video resolution as per the network conditions. Allows users to choose the video resolution.

Interoperability with existing technologies

We have chosen technical standards that are mainstream and have huge support in web browsers and mobile applications. WebRTC is supported in most modern web browsers and mobile applications. It is used by some of the well known video conferencing products like Google Duo, Facebook messenger, and Microsoft Teams.

Similarly, WebSocket is the de-facto standard for bidirectional communication between clients and servers.

Other technologies like HLS (HTTP Live Streaming) are chosen to provide support for all kinds of devices. Though MPEG-DASH is a newer technology than HLS for live streaming, we chose HLS because Apple doesn’t support MPEG-DASH yet.

Resource Requirement & Management

Our backend solution works on general purpose commodity hardware. We don’t require any specialised hardware.

We recommend general purpose Linux based VMs with at least 4 CPU cores, 16GB RAM, 2.5 GHz Intel Scalable Processor, and network performance of 5Gbps.

Platform dependence

Our solution works across web browsers, desktop, and mobile applications. For deploying the backend infrastructure, we recommend Linux based VMs.

Technology Stack

Golang: Server-side programming language. Go is widely used on the server side to build highly concurrent and scalable applications. Our team has a track record of developing scalable products using Go.
WebRTC: For real-time video and audio communication. WebRTC is a free and open source framework that enables secure real-time communications (RTC) in web browsers and mobile applications. It was open-sourced by Google. Google Duo, Facebook Messenger, and numerous other video conferencing apps are built using WebRTC.
WebSocket: We use WebSocket for the initial handshake before a video call starts and to send and receive chat messages between all the participants in the conference.
Consul: For service discovery and load balancing between our backend server clusters.
Postgres: To persist user and room data.
Redis: To cache information about currently active rooms and users.
HAProxy: For load balancing requests to the Gateway server.
gRPC: For inter-service communication.
GStreamer/FFmpeg: For audio/video mixing, processing, and transcoding.
HLS: For live streaming.
React: Our choice of framework for building the web client.
Flutter/Electron: For building the desktop client.
Java (Android)
Swift (iOS)
AI/ML frameworks: TensorFlow, TensorFlow.js, PyTorch

VOIP (Voice Over IP)

Real-time communication on the internet has been on the rise in the recent past. Many companies are offering video conferencing solutions for social as well as enterprise users. Real-time communications are powered by VOIP (Voice Over IP) protocols. VOIP protocols include:

Session Initiation Protocol (SIP), connection management protocol developed by the IETF
Real-time Transport Protocol (RTP), transport protocol for real-time audio and video data
Real-time Transport Control Protocol (RTCP), sister protocol for RTP providing stream statistics and status information
Secure Real-time Transport Protocol (SRTP), encrypted version of RTP
Session Description Protocol (SDP), a syntax for session initiation and announcement for multimedia communications and WebSocket transports.

WebRTC

WebRTC is a peer-to-peer, secure communication protocol built on top of UDP. It allows video, voice, and generic data to be sent between peers without requiring any plugins or third-party software.

One of the significant milestones in the video conferencing area was the release of WebRTC. This open-source project provides web browsers and mobile applications with real-time communication (RTC) capabilities.

WebRTC powers many well-known video conferencing products such as Google Duo, Facebook Messenger, and Microsoft Teams.

WebRTC’s protocol stack includes UDP, ICE, STUN, TURN, DTLS, SRTP and SCTP.

ICE, STUN and TURN are necessary to establish and maintain a peer-to-peer connection over UDP. DTLS is used to secure all data transfers between peers, as encryption is a mandatory feature of WebRTC. Finally, SCTP and SRTP are the application protocols used to multiplex the different streams, provide congestion and flow control, and provide partially reliable delivery and other additional services on top of UDP.

Scaling WebRTC

WebRTC is a peer-to-peer protocol. This means that every participant in a conference maintains a connection with every other participant, creating a mesh topology. This kind of architecture doesn’t scale well beyond 10 or 12 participants. This is the reason why many of the WebRTC powered apps have a limitation on the number of participants.
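
A quick calculation shows why a mesh stops scaling. The helper names and the 500 kbps per-stream bitrate used in the note below are assumptions for illustration only.

```python
def mesh_connections(participants):
    """Total peer connections in a full mesh: n * (n - 1) / 2."""
    return participants * (participants - 1) // 2


def mesh_uplink_kbps(participants, stream_kbps):
    """Upload bandwidth one participant needs to send a copy to every peer."""
    return (participants - 1) * stream_kbps
```

With 12 participants the mesh already needs 66 peer connections, and at an assumed 500 kbps per stream each client has to upload 5500 kbps, which is more than many home and mobile connections can sustain.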

WebRTC deployment topologies

By default, WebRTC connections form a Mesh topology since it is a peer-to-peer protocol. To scale it better, we use either an SFU or an MCU based architecture. Both of these form a star topology.

SFU (Selective Forwarding Unit) and MCU (Multipoint Control Unit)

To scale WebRTC to thousands or even more participants in a conference, an SFU (Selective Forwarding Unit) or an MCU (Multipoint Control Unit) is used.

In an SFU based architecture, every participant in the conference sends their video/audio to the SFU, and the SFU forwards it to the other participants in the call. The SFU selectively forwards the right stream to each participant depending on their bandwidth or display.

MCU is a more sophisticated media server than SFU. It mixes the audio/video sent by each participant and transmits it to the other participants over a single stream. MCU works well for low-end mobile devices, but it requires more processing power on the server and is usually costlier than an SFU. This is the reason why SFU powers most real-world WebRTC deployments.
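
The difference between the three topologies can be summarised by counting the media streams each participant has to send and receive. This is a simplified model that assumes one media stream per participant and ignores simulcast layers. The function name is hypothetical.

```python
def streams_per_participant(participants, topology):
    """Return (uploads, downloads) for one participant in a conference."""
    others = participants - 1
    if topology == "mesh":
        return others, others   # send to and receive from every peer
    if topology == "sfu":
        return 1, others        # one upload, the SFU forwards the rest
    if topology == "mcu":
        return 1, 1             # one upload, one mixed stream back
    raise ValueError(f"unknown topology: {topology}")
```

The MCU gives clients the lightest load, but the mixing happens on the server, which is where its extra processing cost comes from.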

Scaling an SFU

SFU Capacity/Conference Size

Sooner or later, you’re going to run into the capacity limits of the machine that’s running the SFU. It could be the CPU or bandwidth (Mbps).

Geo-distributed connections

Calls connecting from multiple geographies to an SFU hosted in some remote zone.
High latency. Expensive inter-regional connections.

Cascaded SFUs (used to solve scaling challenges)

Two or more SFUs that are interconnected in such a way that one conference can span multiple SFUs.
Participants can join any one of the SFUs in the cascade and seamlessly interact with all the other participants in the conference, regardless of whether they are on the same SFU or not.
Cascade can be used to create a conference that dynamically grows to virtually any size as participants join.

Geo-distributed SFU cascades

Use some location-based routing algorithm to connect users to the SFU closest to them.
When a request is made to join the same conference on two separate SFU clusters, you can, on-the-fly, cascade them and create one conference that spans both geographies.
The local SFU cluster will forward only one copy of the video to the remote SFU cluster which then forwards it to all its locally connected participants.
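
The bandwidth saving from forwarding a single copy between clusters can be estimated with a small calculation. This is a simplified model: the function name is hypothetical, and it counts video streams crossing from one region to the other in one direction only.

```python
def cross_region_streams(local_senders, remote_receivers, cascaded):
    """Number of video streams crossing from the local to the remote region."""
    if cascaded:
        # One copy per sender travels over the SFU-to-SFU relay.
        return local_senders
    # Without a cascade, every remote receiver pulls its own copy
    # of every sender's stream across regions.
    return local_senders * remote_receivers
```

For 3 senders in one region and 10 viewers in another, a single SFU carries 30 streams across regions, while a cascade carries 3.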

Open Source WebRTC SFUs/Gateways

Following are some of the most popular open-source WebRTC gateways.

Janus: Written in C. Supports recording and streaming. Can manage around 250 participants in a conference room.
Jitsi: Written in Java. It uses XMPP for signalling.
Mediasoup: Written in Node.js. Uses the libwebrtc C++ library under the hood.

AI in Video Conferencing

AI for Security

Recent advances in the field of Computer Vision have led to a multitude of applications in the field of facial/fingerprint/voice recognition. Deep neural networks based on open source libraries like PyTorch and TensorFlow can be used to develop highly accurate facial recognition systems. This can be used for adding an extra layer of authentication when customers join a meeting room.

AI for Accessibility

Advances in natural language processing have made real-time language translation a reality. State-of-the-art open-source Neural Machine Translation (NMT) systems are able to understand human language and translate it into multiple languages. Open source solutions are available for . Advances have also been made in the field of real-time translation of which can help people with hearing challenges.

AI for Efficiency

Efficient communication requires virtual meetings to happen without distractions. Distractions can be audio or visual, such as background noise or the participant’s visual background. Visual backgrounds can be removed in real time using models deployed on edge devices with TensorFlowJS. Noise is a form of audio distraction, and machine learning algorithms can help remove stationary and non-stationary audio noise in real time.

AI for User Experience

AI can be used to incorporate a whole host of features that delight end-users. These include, but are not limited to:

Automatically capturing minutes & highlights of the meeting in the form of audio, video and text for records and sharing.
Chatbot support to provide minutes of the meeting for absentee participants.

Features & Roadmap

Product Functionality Buckets

These are product functionality buckets that we can use to prioritise and build features. Every product feature will revolve around Communication and Security, which form the core of the product’s functionality.

Basic Functionality

Basic functionality is what is required for basic communication. Without it, users can't use the app.

Core Features

Core features are those that are part of our core philosophy (Communication & Security).

Enhancements Features

Enhancements are additional features which enhance the basic and core functionality.

Delight Features

Delight features are the ones that wow users and make them excited while using the product.

MVP (Minimal Viable Product)

An MVP is the smallest solution that delivers customer value. We should build an MVP before committing any major resource investment.

Hackathon Roadmap

Here is the Product Roadmap from inception to long term feature building

Product Roadmap Phase -1 (Ideation)

Existing Technologies, Backend Research, Competitive Technical Analysis

Exploring Communication as domain, Existing products study, Market Study, Benchmarking

Backend - Done, Frontend - Done, UX/UI Design - Done, DS/AI/ML - Done, Cyber Security - Done, Product Manager - Done, App Developer - Done

Product Market Fit, Challenges, Feature Listing, UX/UI Design Flows

Home Page: Start a call, Join a call

Video Call Interface: Grid Layout, Audio Controller, Video Controller, Invite meeting link/Meeting ID, Leave Call

Product Roadmap Phase-2 (Prototype)

Home Page: Signup, Login

App Interface: User Profile, Join a Call, Start a Call

Video Call Interface: List of participants, Chat, Screen Share, File Sharing

Home Page: Login/Signup with Google

App Interface: Import Contacts from Google, Add Contacts, Schedule a call, List of scheduled calls, Invite with email ID

Product Roadmap Phase-3 (Product Building)

Org Setup, Admin Panel, User Management, Admin Controls, Dial In, Important Integrations like calendar, bridge calling, existing tools etc

Deployment on Gov Cloud, Scalability Testing/Load Testing, Performance Testing, Debugging, UI Polishing

Yearly Roadmap

Product & Tech: Work on product functionality, scalability, a market-ready product, and polished UI/UX.

Marketing Strategy: We will work on different go-to-market strategies and marketing mediums to minimise marketing cost while keeping the marketing of the product impactful.

Sales Strategy: Work on different sales channels and sales pitches to make them as impactful as possible.

Human Resource: We will try to hire passionate young talent in different domains who can work with us at the same energy and pace and also learn from others’ expertise. We will keep hiring to a minimum to reduce expenditure while still fulfilling the product requirements.
