Introducing Cinegy Cinescore – Creating Baselines
Today we will give you a quick overview of Cinegy Cinescore and show how it can be used effectively for platform benchmarking.
What is benchmarking and why is it important? Benchmarking gives a better understanding of the current platform's capabilities and of ways to improve it.
It is important to note that benchmarking gives meaningful results only when you define the goal: what do you want to achieve with the test? Do you want to check how generic and universal the platform is, or whether it is capable of performing specific tasks?
Once the goal is defined, run a series of tests to check the results.
Of course, it is useful to have a metric that gives you numeric results showing how well or badly the platform behaves.
Test results are illustrative, no doubt, but on their own they do not tell you whether your platform runs efficiently under the given conditions or whether something could be improved without changing the platform components. So it is good to have reference results from identical platforms.
Nevertheless, the given platform may not be the best choice for the defined goal. Comparing it to other possible variants gives a better understanding of how platform components affect performance.
Once the comparison is made, the differences between the given and optimal platforms can be investigated to see whether the existing platform can be improved or should be replaced.
Also note that a bigger hammer is, in many cases, more expensive. So there may be a point where the platform is good enough at performing its tasks and moderate extra spending will not improve performance dramatically.
There are many benchmark types available depending on the chosen classification principle. In our case we will consider the following ones:
- CPU benchmarks that assess the processor performance through a series of tests.
- GPU benchmarks that assess the graphics hardware performance through a series of tests.
- Storage benchmarks that assess the local/remote drives performance through a series of tests.
While there are a lot of tests available, we will choose a few reference ones that are illustrative enough:
- Passmark Performance Test, which can test the CPU, GPU, and local drives.
- Maxon Cinebench, which allows you to test the CPU and GPU. The missing storage performance test can be covered with IOMeter, for example.
- Cinegy Cinescore for CPU/GPU tests. Storage tests may be added in future releases, so please stay tuned.
The list is not complete, of course – there are other benchmark tools available on the market.
Passmark Performance Test
Let's look in more detail at the first selected tool, Passmark Performance Test. This tool is available for download from the corresponding website and is free to use for the first 30 days. It has quite a nice set of tests and enables comparison of the results against identical platforms or against a variety of other options.
As the tests are quite generic, they allow you to check the platform behavior as a whole, not focusing on specific requirements. Using this tool gives an answer to the main question: "Is the system working efficiently in general?". It does not help identify the broadcast capabilities of the system.
As the CPU is not the only factor that affects platform performance, GPU tests should also be executed on the system. Using GPU resources efficiently can greatly reduce broadcast platform costs while keeping the platform's capabilities and capacity.
Passmark Performance Test allows a generic assessment of the system through a series of tests. The results provide a general understanding of system health, but the tool is not free.
Maxon Cinebench has fewer features than Passmark Performance Test but is based on a real rendering engine used to create special effects for films. So its results give a slightly better understanding of how good the platform is. Comparison with some reference platforms is available, although it is not as exhaustive as the Passmark database.
The free Cinebench tool is based on Maxon Cinema 4D software, which is used for 3D content creation. So its results are closer to the tasks performed daily by broadcast systems, but they are still not intuitive.
In order to help broadcasters better understand capabilities of their platforms, Cinegy introduces a special benchmarking tool – Cinegy Cinescore.
It is aimed at performing broadcast specific assessments of the hardware and providing quantitative results that could be of use when defining the broadcast configurations for a particular platform.
Why Cinegy Cinescore?
So why does Cinegy Cinescore suit broadcasters' needs better? There are several reasons:
- It targets industry-specific video encoding and decoding benchmarks, not the general computing performance of the system.
- It provides real numbers that help you understand platform capabilities with regard to broadcast-oriented tasks.
- It is not bound to codecs produced by Cinegy. Third-party codecs are included in the tests, and the list will be extended in the future.
- The tests cover not only standalone device performance but also the CPU/GPU/RAM data flow, which may affect platform performance dramatically.
Performing Benchmarking with Cinegy Cinescore
Once Cinegy Cinescore is launched, the "About" section displays a summary of the current machine configuration, including the operating system the computer is running, the models of the central and graphics processing units, and the amount of random-access memory (RAM) installed:
On the right side there is a scores section that displays the estimated Cinegy Cinescore values for the machine. If no tests have been completed, the score is zero.
To see the actual results, run the benchmarking tests by pressing the "Run tests" button. The average CPU/GPU load is measured, and the CPU/GPU score is calculated.
Once tests are started, the "Cinegy Cinescore PC Performance" dialog appears. The list of profiles to be tested is displayed on the left side. The profiles are grouped according to the TV format they are assigned to.
The intermediate results of profile testing are displayed in real time on the right side. The current test progress is shown dynamically at the top, while CPU and GPU utilization during the test and the calculated frames per second value are shown below.
The values vary as the test progresses. The average value will be reported as the test result.
Profiles in the lists are marked with the following symbols:
- A profile that is currently being processed is marked with the orange spinning arrows.
- A profile that is processed successfully is marked with the green check mark.
- In case of any errors during the processing or if it is cancelled by user, the red exclamation mark will be displayed.
So how are the tests done and why are the results relevant?
The utility reads sample footage into RAM to avoid any disk read speed penalties during the test. Once buffered, the media is encoded automatically. This allows assessment of the real processing performance, as only the CPU, GPU, and RAM are involved in the test.
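The idea of buffering the media first and timing only the encode step can be sketched in a few lines. This is a minimal illustration, not the Cinegy Cinescore implementation: the function name `benchmark_encode` is hypothetical, the "footage" is synthetic data, and `zlib.compress` is only a stand-in for a real video encoder.

```python
import time
import zlib


def benchmark_encode(frames, encode):
    """Encode every pre-buffered frame and return the average frames per second."""
    start = time.perf_counter()
    for frame in frames:
        encode(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


# Hypothetical stand-in footage: 25 synthetic SD-sized frames, already held in RAM,
# so no disk access happens inside the timed loop.
frames = [bytes([i % 256]) * (720 * 576 * 2) for i in range(25)]

# zlib is NOT a video codec; it only plays the role of "some CPU-bound encoder".
fps = benchmark_encode(frames, zlib.compress)
print(f"average: {fps:.1f} fps")
```

Because the frames are created before the timer starts, the measured figure reflects only processing speed, which is the point the paragraph above makes.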
The collected information is split into three tabs within the "Encoding results" section with the average characteristics of the frame rate, CPU, and GPU load values displayed accordingly.
The results for SD/HD/UHD/8K TV formats can be reviewed by clicking the corresponding tabs.
In most cases, results from a single machine are not enough to draw any conclusions, so several reference machines may be added for comparison. To add such a machine, simply select a previously saved results file from the list in the "Compare" menu:
Cinegy Cinescore comes with 3 pre-defined reference configurations to be compared against. The list of default reference machines will be extended in future versions.
You can also select reference machines for comparison by pressing the "Add" button in the "Compare" menu and selecting the *.csr file in the dialog that appears.
Average frame rate values are graphically represented on the "FPS (avg)" tab individually for each profile. These values provide you with quantitative measurement of your machine performance within the defined load that is typical of broadcast operations; for example, generating XDCAM style or HEVC encoded output stream. You can see that results vary greatly depending on the encoding type used, defined encoding settings and also the CPU type:
You can navigate through tabs to see additional details.
The results of the average CPU load on the machine are displayed individually for each profile on the "CPU load (avg)" tab.
The results of the average GPU load on the machine are displayed individually for each profile on the "GPU load (avg)" tab.
As exact machine configurations are important, the details of the machines added for comparison are displayed here in the table view:
You can check the exact OS, CPU, GPU, and platform versions used when the results were collected. Analyzing this info gives a deeper understanding of which components can be tweaked to achieve better results.
Let’s have a closer look at the results we received and what information can be extracted from them.
CPU and GPU Values
The Cinegy Cinescore values for CPU and GPU give you a general idea of how productive the machine is, or what its "weight" is in comparison with other configurations. For example, you can see that with the current graphics board on this laptop the GPU score equals zero, so it cannot use GPU offloading features for broadcast.
This graph shows the average FPS level achieved by the machine:
Depending on the encoder type, the results differ for the same machine. Typically, MPEG-2 profiles achieve a higher FPS than H.264 or HEVC ones when processed on the CPU.
For example, our laptop is able to encode about 70 frames per second in XDCAM HD 422 at 1080@25i. At the same time, only about 30 frames per second are possible for the H.264 High Quality profile in the same TV format.
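To put these two figures in perspective, they can be expressed as a multiple of real time. The sketch below assumes that 1080@25i means 25 frames per second must be delivered; the helper name `realtime_factor` is hypothetical.

```python
def realtime_factor(measured_fps, format_fps):
    """How many times faster than real time the encoder ran."""
    return measured_fps / format_fps


FORMAT_FPS = 25  # assumption: 1080@25i requires 25 frames per second

xdcam = realtime_factor(70, FORMAT_FPS)  # laptop figure quoted above
h264 = realtime_factor(30, FORMAT_FPS)   # laptop figure quoted above

print(f"XDCAM HD 422:       {xdcam:.1f}x real time")  # 2.8x
print(f"H.264 High Quality: {h264:.1f}x real time")   # 1.2x
```

Read this way, the laptop has comfortable margin for XDCAM HD 422 but runs only slightly faster than real time for H.264 on the CPU.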
Please take a look at the AVC-Intra results for different machines. As you can see, the results differ significantly across the 3 machine configurations. Some might assume that the difference comes from clock speed, but closer examination shows that the least performant CPU runs at 4 GHz while the top one runs at just 2.6 GHz. The real answer is the number of cores: the i7-6700K has only 4 cores, the E5-2650 has 8 cores, and the E5-2697 has 14 cores. AVC-Intra is the type of codec that scales effectively across all cores. The same applies to Daniel2 and ProRes, for example. By contrast, H.264 encoding results do not vary much, because the nature of the algorithms used prevents H.264 from scaling efficiently across cores.
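A rough way to see why core count outweighs clock speed for a codec that scales well is to multiply the two. The sketch below is a crude illustration only: "core-GHz" is a hypothetical figure that ignores architecture, turbo modes, and memory bandwidth, and the pairing of 4 GHz with the i7-6700K and 2.6 GHz with the E5-2697 is inferred from the text. The E5-2650 is omitted because its clock speed is not given.

```python
# Core counts and clock speeds as quoted in the text (pairing inferred, see lead-in).
cpus = {
    "i7-6700K": {"cores": 4, "ghz": 4.0},
    "E5-2697": {"cores": 14, "ghz": 2.6},
}


def aggregate_ghz(cpu):
    """Crude upper bound on parallel throughput: cores multiplied by clock speed."""
    return cpu["cores"] * cpu["ghz"]


for name, cpu in cpus.items():
    print(f"{name}: {aggregate_ghz(cpu):.1f} core-GHz")
```

Even with a much lower clock, the 14-core CPU offers more than twice the aggregate figure, which is consistent with its lead in a well-parallelized codec such as AVC-Intra.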
Another important example: GPU offloading. As you can see, we achieved quite impressive results of more than 300 frames per second on a single machine for hardware-accelerated H.264 encoding using an NVIDIA GPU. So we can assume that with GPU offloading we can run about 8 HD channels, or even more, on a single box. At the same time, the machine does not need the best CPU on the market, as all the hard work is done by the GPU. If you compare this with the previous results, where CPU encoding was used, you can see that the price/performance ratio is much better when GPU offloading is used.
Having such a great and precise result may be dangerous. Someone might assume that a platform showing 120 fps in 4K is able to run 4x4K channels in a single box. Unfortunately, this is not true. Cinegy Cinescore provides the theoretical maximum the system is able to handle.
Other factors and processes (like input/output pipeline latencies and colorspace, frame size, or aspect transformations) will reduce this number. Also, not all background processes can be removed from the OS, so there should always be a safety buffer in system performance to handle unexpected processing spikes. So, to be on the safe side, you have to reserve some system resources and not expect the machine to run at 100% load 24/7 with no flaws.
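The gap between the theoretical maximum and a safe channel count can be sketched as simple arithmetic. The function `estimate_channels`, the 25 fps channel rate, and the 30% reserve are all assumptions chosen for illustration; they are not Cinegy recommendations, and the right reserve depends on the actual pipeline.

```python
import math


def estimate_channels(benchmark_fps, channel_fps, headroom=0.3):
    """Estimate how many channels fit once a share of capacity is held in reserve."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable_fps = benchmark_fps * (1 - headroom)
    return math.floor(usable_fps / channel_fps)


# 300 fps hardware-accelerated H.264 result, 25 fps HD channels
print(estimate_channels(300, 25, headroom=0.0))  # 12, theoretical ceiling
print(estimate_channels(300, 25))                # 8, with a 30% reserve

# 120 fps in 4K, 25 fps channels
print(estimate_channels(120, 25))                # 3, not 4
```

With this assumed reserve, the 300 fps result lands at the "about 8 HD channels" mentioned earlier, and the 120 fps 4K result falls short of four channels.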
Once benchmark results are received, we can try to identify bottlenecks or hotspots: what can be improved to get the best cost/efficiency ratio from the platform? Aside from the benchmarking tool, other tools are quite useful here:
CPU-Z, GPU-Z, and Performance Monitor, just to name a few.
These tools provide you with valuable diagnostic info that helps identify the machine's weak points and possible ways to improve them. Having the most powerful and expensive platform is not enough. It should be carefully tuned to get the best possible results for the specific broadcast requirements.
The physical hardware configuration is rather important. For example, a graphics card in the wrong slot may perform much worse than the same card in another identical platform. Memory channel modes also play a very significant role: with the same amount of RAM, the access speed may differ significantly. In the following example, the same machine was used for both benchmark runs; the only thing that changed was the memory channel mode, from single to quad channel:
As you can see, there is a clear increase in FPS in SD. The same applies to HD, UHD, and 8K.
BIOS and OS power management features, while useful for reducing power consumption and heat, may significantly reduce platform performance:
Some parasitic background processes (like antivirus software) can be removed to gain a bit more performance; others are native to the OS and can only be tuned to run on a schedule, at times when the valuable load is lower. This is the CPU utilization graph captured by Performance Monitor:
It allows us to quickly identify the issue: there was a CPU usage spike at the specified time, possibly due to a scheduled background process that caused issues on the playout side.
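The same kind of spike can be found programmatically once utilization samples are available as data. The sketch below is an illustration under stated assumptions: the sample values and timestamps are hypothetical, the function `find_spikes` is not part of any tool mentioned here, and "more than twice the median load" is an arbitrary threshold.

```python
from statistics import median


def find_spikes(samples, factor=2.0):
    """Return (timestamp, load) pairs whose load exceeds factor x the median load."""
    baseline = median(load for _, load in samples)
    return [(ts, load) for ts, load in samples if load > factor * baseline]


# Hypothetical CPU utilization samples: (time of day, percent load)
samples = [
    ("02:58", 22),
    ("02:59", 24),
    ("03:00", 95),
    ("03:01", 91),
    ("03:02", 23),
]

print(find_spikes(samples))  # [('03:00', 95), ('03:01', 91)]
```

The median is used as the baseline because a few large spikes barely move it, whereas they would pull a plain average upward and hide themselves.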
External factors like network stability also affect system performance. A missing packet or unexpectedly large latency may be visible on air in some bad scenarios. Redundancy is very important here.
Another factor to mention is the platform environment. For example, poor air conditioning may lead to component overheating, which will affect system performance.
Please note that while troubleshooting an issue it is better to avoid making several changes at once. Doing so masks the real cause, and the resulting platform may not be optimal. For example, changing the motherboard, graphics card, and CPU may sometimes solve a performance problem, but it might be that only the graphics card was faulty and/or inserted into a slot with an improper configuration.
At the same time, the issue might be caused by a combination of several factors. For example, the CPU/GPU unit runs too hot, its performance is automatically throttled by the system, and it is not powerful enough. Replacing the unit will solve the issue only temporarily: the poor air conditioning, not the new unit, will cause trouble again.