Software Based Conferencing - Can Standard Servers Deliver?
A white paper by Håkon Dahle, CTO, Pexip
December 16, 2013

Multipoint video conferencing has always required a significant amount of processing power in order to deliver a decent customer experience.

Over the last twenty years, we have seen a tremendous development in performance, scalability and user experience. The first multipoint video conferencing systems were custom architectures with application specific processors, which were complex and difficult to program.

Over time, the industry moved to less esoteric CPU architectures which were lower cost and easier to program, yet still required custom hardware designs. Only lately has it become possible to deliver the required performance and scale using off-the-shelf servers with standard Intel processors.

With the relentless performance improvements available in standard server designs, we expect solutions based on custom hardware to disappear over the next few years.

Figure 1: The third generation MCU is simply software running on standard off-the-shelf servers.

First Generation MCUs:

The 1990’s and application specific processors

Standards-based conferencing really started in the early 1990’s with the H.320 and H.261 protocols. As endpoints gained acceptance among early adopters, a need arose for a way to arrange large multipoint meetings. Companies that saw success delivering Multipoint Conferencing Units (MCUs) in the 1990’s were VideoServer and Accord. Both vendors used custom hardware based on application specific processors from the US processor manufacturer IIT (later renamed 8x8).

Second Generation MCUs:

Custom hardware using standard DSPs

In the late 1990’s and early 2000’s, VLIW (Very Long Instruction Word) processors such as Philips TriMedia and Equator BSP were widely adopted in numerous endpoints. However, these processors were not adopted as widely in MCU products.

The early 2000’s instead saw the rapid success of Texas Instruments (TI) in this space, where its C6000-series and DaVinci-series DSPs (Digital Signal Processors) went on to replace the VLIW processors. By the end of that decade, most vendors delivered MCU products based on TI DSPs: Codian (later acquired by Tandberg, which was in turn acquired by Cisco), Radvision and Polycom.

Common to all these products were custom hardware architectures, either in pizza-box form factors or as large chassis-based systems, which carried big price tags and typically faced obsolescence after 4-6 years. Complicating matters further, some of these systems used DSPs with dedicated hardware accelerators for H.264, and so could not easily be programmed to support new video codecs such as VP8, VP9 and H.265. For the end user this will eventually mean that these expensive hardware platforms cannot be upgraded to support recent developments in video conferencing technology; a forklift upgrade will be required.

Third Generation MCUs:

Performance and scale, software on industry standard servers

While custom hardware architectures can provide excellent performance, they are expensive to develop and their development cycles are long. Using software on standard servers is a reasonable alternative. In fact, with the latest processors from Intel, and in particular with Intel’s “Sandy Bridge” processors which started shipping in 2011, industry standard servers are now suitable platforms for media-intensive applications. Instruction set extensions such as SSE (Streaming SIMD Extensions) and AVX (Advanced Vector eXtensions), together with hyper-threading and an ever increasing number of processor cores on a single die, allow for performance even better than that of custom hardware architectures using ASICs, FPGAs and DSPs.

As an example, using a standard 1RU server from any major vendor (HP, Dell, IBM, Cisco, etc.) with dual Intel E5-2600 series CPUs, each with 8 cores at 2.7GHz, Pexip can deliver 32 ports of high definition conferencing. In terms of “HD ports per rack unit”, this compares extremely well with traditional custom hardware designs.

For even higher density, blade servers now make it possible to get more than 1000 ports of true HD conferencing in a mere 10RU of rack space.

Furthermore, with the recent introduction of Intel’s E5-2600v2 series of processors, we see yet another improvement in performance. A standard 1RU server configured with dual Intel E5-2600v2 processors, each with 12 cores at 2.7GHz, can deliver 48 ports of 720p30 high definition conferencing, where a port can be any video codec – H.263, H.264, H.264SVC or VP8.
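
The port figures quoted above imply a simple rule of thumb, which the following sketch makes explicit. It is illustrative arithmetic only: the class and function names are hypothetical and not part of any Pexip tool, and the only inputs are the core counts and port counts stated in the text.

```python
# Illustrative arithmetic only. The figures come from the text above;
# ServerConfig and summarize are hypothetical names used for this sketch.
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    name: str
    sockets: int
    cores_per_socket: int
    hd_ports: int
    rack_units: int = 1

    @property
    def total_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def ports_per_core(self) -> float:
        return self.hd_ports / self.total_cores

    @property
    def ports_per_ru(self) -> float:
        return self.hd_ports / self.rack_units


CONFIGS = [
    ServerConfig("dual E5-2600, 8 cores each", 2, 8, 32),
    ServerConfig("dual E5-2600v2, 12 cores each", 2, 12, 48),
]


def summarize(configs):
    """Return (name, total cores, ports per core, ports per RU) per config."""
    return [
        (c.name, c.total_cores, c.ports_per_core, c.ports_per_ru)
        for c in configs
    ]


if __name__ == "__main__":
    for name, cores, per_core, per_ru in summarize(CONFIGS):
        print(f"{name}: {cores} cores, "
              f"{per_core:.1f} ports/core, {per_ru:.0f} ports/RU")
```

Both configurations work out to two HD ports per core at 2.7GHz, so the step from 32 to 48 ports tracks the increase in core count from 16 to 24. By the same arithmetic, the blade server figure of more than 1000 ports in 10RU corresponds to more than 100 ports per rack unit.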

While this is the latest offering from Intel, the next generation architecture (codename “Haswell”) is already available in desktop and laptop computers. When this architecture becomes available in a dual or quad socket server CPU design, we should expect yet another step-change in performance. As an example, the new AVX2 instruction set extensions widen integer vector operations from 128 to 256 bits, which will double the integer performance for many core video processing algorithms.
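
The doubling attributed to AVX2 follows from register width: integer SIMD operations under SSE work on 128-bit registers, while AVX2 extends them to 256 bits. The sketch below shows the arithmetic under stated assumptions: 8-bit pixel samples, one row of a 1280-pixel-wide 720p frame, and one vector operation per full register. The helper names are hypothetical, and real speed-ups also depend on memory bandwidth and on the algorithm itself.

```python
import math

# Register widths for integer SIMD operations.
SSE_INTEGER_BITS = 128
AVX2_INTEGER_BITS = 256


def lanes(register_bits: int, sample_bits: int = 8) -> int:
    """How many samples of the given size fit in one vector register."""
    return register_bits // sample_bits


def vector_ops_per_row(row_samples: int, register_bits: int,
                       sample_bits: int = 8) -> int:
    """Vector operations needed to touch every sample in a row once."""
    return math.ceil(row_samples / lanes(register_bits, sample_bits))


if __name__ == "__main__":
    row = 1280  # samples in one luma row of a 720p frame (assumed 8-bit)
    for label, bits in (("SSE", SSE_INTEGER_BITS), ("AVX2", AVX2_INTEGER_BITS)):
        print(f"{label}: {lanes(bits)} lanes, "
              f"{vector_ops_per_row(row, bits)} ops per row")
```

Under these assumptions a row takes half as many vector operations with 256-bit registers as with 128-bit registers, which is where the factor of two comes from.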

The impact of Moore’s Law on the future of conferencing

Intel’s Sandy Bridge, Ivy Bridge and now Haswell architectures are good examples of Moore’s Law in action. Moore’s Law states that the number of transistors on a chip doubles every 18-24 months. While the end of Moore’s Law has been predicted several times, the observation has held true for the last 40 years.

The implications for Pexip customers are important: By using the latest server designs, they can expect to see a doubling of port capacity every two years. We have already shown progress from Intel Sandy Bridge (32 ports per RU) to Ivy Bridge (48 ports per RU). Get ready for another increase in performance and capacity as Haswell becomes available some time in 2014.
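
The expectation of a doubling every two years can be written as a simple exponential. The sketch below is an illustration, not a forecast: it takes the 48 ports per RU quoted above as a baseline and assumes a fixed two-year doubling period. `project_ports` and `growth_ratio` are hypothetical helpers introduced here.

```python
def project_ports(baseline_ports: float, years: float,
                  doubling_period_years: float = 2.0) -> float:
    """Project port capacity assuming it doubles every fixed period."""
    if doubling_period_years <= 0:
        raise ValueError("doubling period must be positive")
    return baseline_ports * 2 ** (years / doubling_period_years)


def growth_ratio(old_ports: float, new_ports: float) -> float:
    """Ratio between two capacity figures."""
    return new_ports / old_ports


if __name__ == "__main__":
    baseline = 48  # ports per RU quoted for the E5-2600v2 server
    for years in (2, 4):
        print(f"after {years} years: {project_ports(baseline, years):.0f} ports/RU")
    print(f"32 -> 48 ports is a factor of {growth_ratio(32, 48):.2f}")
```

The step from 32 to 48 ports per RU quoted in the text is a factor of 1.5 between two processor generations; the two-year doubling is the assumed long-run trend that the projection applies to that baseline.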

Figure 2: Moore's Law shows that processor transistor count doubles every 18-24 months. Curve shows transistor count for popular microprocessors and their time of introduction. Illustration courtesy of Wikipedia.

Other benefits of software based conferencing

However, there is more to conferencing than just processor performance. The software-based approach has a number of other benefits, which will be covered in separate white papers; we will mention just two important aspects here:

  • As video conferencing becomes a mission-critical collaboration tool, availability and reliability become more critical. By leveraging software and virtualization, the cost of having a standby server is dramatically reduced compared with the cost of having a second custom-hardware MCU chassis on standby. Furthermore, VMware tools such as vMotion and High Availability enable yet another level of resilience, which has never been the focus of custom hardware architectures.
  • Enterprises are adopting virtualization as a key part of their data center strategies in order to reduce costs, consolidate resources and streamline management and deployment. For these customers, the ability to run conferencing as just another data center workload is extremely attractive: conferencing can now be deployed, managed and monitored across the globe.

Conclusions

Software based conferencing today delivers performance and scale equal to or better than custom hardware architectures. Moore’s Law indicates that increase in performance and density will continue. In addition, software based conferencing allows IT professionals to view video conferencing as yet another data center workload, and reap all the benefits that standard data center and virtualization tools allow in terms of reduced cost of ownership, ease of deployment, ease of management, increased reliability and optimal usage of resources.

References

  1. http://www.wainhouse.com/files/wrb-05/WRB-0527.pdf “MXP is based on the newest chip technology from Philips TriMedia”…“Additionally, MXP is the architecture inside the TANDBERG MPS, the carrier class MCU announced by the company last month.”
  2. http://support.polycom.com/global/documents/support/setup_maintenance/pr... “The MPM cards perform the various RTP, audio and video processing functions on the RMX 2000. MPM cards are based on the ATCA standard, with a card manager (CM) and up to 26 720MHz TI DSP’s”
  3. https://www.google.com/patents/US6584077 “The programmable RISCIIT 150 maintains the host port 164, TDM interface 158 and pixel interface 166, and controls the H.221/BCH 156, Huffman CODEC 154 and other peripherals internal to the VCP. The VP5 152 performs the compression primitives and is controlled by the RISCIIT 150. For detailed information, see the IIT VCP Preliminary Data Sheet and VCP External Reference Specification.”
  4. http://newscenter.ti.com/index.php?s=32851&item=126425 “DALLAS (June 5, 2003) -- Texas Instruments (NYSE: TXN) (TI) today announced that RADVISION (NASDAQ: RVSN) chose TI’s advanced TMS320C6000™ programmable digital signal processors (DSPs) to power its new MVP media processor board, a key component of the company’s recently announced Multimedia Control Unit (MCU) version 3. The MCU v3 is the company’s flagship solution for videoconferencing and rich media communications for enterprises, institutions and service providers.”
  5. http://www.frost.com/prod/servlet/press-release.pag?docid=104990756 “Resulting from four years of R&D work, the Codian MCU 4500 Series utilizes the latest in chip technology from TI. The use of next generation digital signal processors (DSPs) enables the series to provide ten times the MIPs of Codian's MCU 4200 Series.”
  6. http://media.freescale.com/phoenix.zhtml?c=196520&p=irol-newsArticle_pri... “TEL-AVIV, Israel – Designing with Freescale Conference – May 11, 2010 – Freescale Semiconductor’s high-performance MSC8144 multicore digital signal processor (DSP) has been selected by Radvision for use in its latest high-definition SCOPIA Elite 5000 Unified Communications Video Infrastructure Multiparty Conferencing Unit.”
  7. http://www.businesswire.com/news/home/20031006005717/en/Equator-BSP-15-P... “CAMPBELL, Calif.--(BUSINESS WIRE)--Oct. 6, 2003--Equator Technologies, Inc., a leader in programmable system-on-a-chip (SoC) processors for digital media, surveillance and video communication applications, announced today its inclusion in the new VSX 7000 video-conferencing system from Polycom, Inc”
  8. http://www.edn.com/electronics-news/4360491/Finally-the-Dawn-of-TriMedia- “The TriMedia is a media processor based on a very long instruction word (VLIW) architecture and targeted at being the "brains" to consumer, communications and computer applications that feature audio, video, graphics and communications datastreams.”…“Videoconferencing is one of three areas that Philips is looking at the TriMedia to be integrated into. Beyond Polycom, the company says it has signed up numerous other video conferencing systems;….”
  9. http://newsroom.intel.com/community/intel_newsroom/blog/2013/09/10/intel... Intel E5-2600v2 launch
  10. http://en.wikipedia.org/wiki/Advanced_Vector_Extensions Intel AVX and AVX2
  11. http://en.wikipedia.org/wiki/Moore's_law “Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years. The period often quoted as "18 months" is due to Intel executive David House, who predicted that period for a doubling in chip performance (being a combination of the effect of more transistors and their being faster)”
  12. http://en.wikipedia.org/wiki/8x8 “In the early 1990s IIT began producing chips, software and other technologies for the videoconferencing market. Frustrated by the high prices and low volumes of these videoconferencing systems, the company changed its name to 8x8 and began marketing its own set-top videoconferencing systems for consumers under the ViaTV brand”