REDLAB’s VFX Team Powers Up with Qumulo Intelligent Storage


Toronto post-production company REDLAB serves producers and agencies on digital projects ranging from commercials and episodic programming to feature films and high-definition streaming video at up to 8K. REDLAB’s clients include Range Rover, Fujifilm, Shopping Channel, Walmart, Honda and more.

The company is growing – in terms of the number of projects coming through the facility, the scope and complexity of clients’ projects and the amount and diversity of data that each one involves.  

REDLAB has been carrying out visual effects work for clients since it was launched, but is now interested in expanding the department and investing in new systems to make VFX a core service. “As we have continued to take on more VFX projects, we realised that we need storage that can meet the performance and scalability requirements,” said Will Garrett, VFX Supervisor at REDLAB.

More Than Capacity

But the VFX team needs more than capacity to store clients’ video. They also need to know how different kinds of data are being used at the studio. They are now using Qumulo workflow storage to support their VFX services. Will said, “I’ve had previous experience with Qumulo at another studio, where I appreciated its ability to scale, manage the network and allocate space. Equally important, Qumulo's file system is able to output real-time data analytics.

“Right now, 75 to 80 percent of our projects are produced at 4K and we are starting to look at workflows for 8K and HDR projects. We have been working on more prime time TV shows, not only commercials, and Netflix and BBC movies that need animation and simulations. We’re using Qumulo mainly for VFX work using 3D software, including Houdini. We run a 20-node 5-blade render farm supporting a VFX team of about 12 artists.”

Qumulo Capacity Trends

VFX is a file-based workflow: files are the means of exchange between software applications, so workflows integrate across applications through the file system. The file system is therefore critical to the VFX storage environment and must address performance, scalability, adaptability and visibility.

Scalability and Performance – Qumulo’s File System

Systems that are too slow either hold up the render farm or keep artists from working while rendering proceeds. As studios move to higher resolutions, performance will become increasingly critical.

Qumulo’s file system has a distributed architecture in which many nodes work together to form a cluster with scalable performance and a single file system. Qumulo clusters form a highly connected storage fabric, linked by relationships based on continuous replication, the Qumulo redundancy function that also supports scaling. Users interact with clusters through standard file protocols and a REST API, while administrators also have a web GUI.

Continuous replication saves multiple versions of files behind the scenes as artists work. The process, which runs as often as practical without affecting cluster performance, copies the data in a directory on the primary cluster and transfers it to a directory on a second, target cluster. Qumulo’s software looks for recent changes and replicates them automatically. Using snapshots, continuous replication generates a point-in-time consistent copy of the source directory on the target cluster. Restoration is then a matter of rebuilding from that copy to avoid data loss.
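The replication cycle described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Qumulo’s implementation: directories are modelled as Python dictionaries, and the helper names `take_snapshot`, `diff_snapshots` and `replicate` are hypothetical.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, bytes]  # path -> file contents at a point in time


def take_snapshot(directory: Dict[str, bytes]) -> Snapshot:
    """Freeze a point-in-time copy of a directory (modelled as a dict)."""
    return dict(directory)


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[Dict[str, bytes], List[str]]:
    """Return (changed or added files, deleted paths) between two snapshots."""
    changed = {path: data for path, data in new.items() if old.get(path) != data}
    deleted = [path for path in old if path not in new]
    return changed, deleted


def replicate(changed: Dict[str, bytes], deleted: List[str],
              target: Dict[str, bytes]) -> None:
    """Apply one replication pass to the target directory."""
    for path, data in changed.items():
        target[path] = data
    for path in deleted:
        target.pop(path, None)


# Hypothetical source directory on the primary cluster, and an empty target.
source = {"shot010/comp.exr": b"v1", "shot010/sim.bgeo": b"v1"}
target: Dict[str, bytes] = {}

# First pass: everything is new, so everything is transferred.
last: Snapshot = {}
snap = take_snapshot(source)
replicate(*diff_snapshots(last, snap), target)
last = snap

# An artist changes one file and removes another; only the delta moves.
source["shot010/comp.exr"] = b"v2"
del source["shot010/sim.bgeo"]
snap = take_snapshot(source)
changed, deleted = diff_snapshots(last, snap)
replicate(changed, deleted, target)
```

Because each pass compares two frozen snapshots, the target always reflects a consistent moment in time rather than a directory caught mid-edit.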

Virtualised Block Layer


Qumulo’s file system is also modular: as demand on a cluster increases, you can add nodes or instances, and capacity and performance scale linearly. With very large numbers of files, sequential processes are not computationally workable. Instead, the Qumulo file system uses parallel and distributed algorithms for querying and management. It sits on top of the Scalable Block Store (SBS), a completely separate, virtualised block layer that handles tasks such as protection, rebuilds and allocating data to particular disks. Without this block layer, protection would have to occur one file at a time or use fixed RAID groups, which can slow the system down.

“We have been able to align Qumulo’s functionality with the way we need to use storage,” said Will. “For example, our previous system began to slow down whenever we needed to add more users, and render time became hard to manage, reaching a bottleneck. Qumulo, on the other hand, has quota management and granular access to certain users. Backing up is rule-based and selective - when retiring projects, you can keep them centrally located, or customise a utility that excludes irrelevant material.”

Qumulo aggregates metadata in real time, which it uses to set real-time capacity quotas. These are deployed, updated and enforced immediately, and do not have to be provisioned. Quotas assigned to directories move with them and the directories can be moved into and out of quota domains. This makes it unnecessary to divide the file system into volumes. Because the built-in aggregator continually keeps the summary of the total amount of storage used per directory up to date, the real-time quotas are precise and agile.
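To see why an always-current per-directory total makes quotas immediate, consider the following minimal sketch. It assumes a simple in-memory tree; the `Dir` class, the `write` function and the `QuotaExceeded` exception are hypothetical names for illustration and are not part of any Qumulo interface.

```python
from typing import Iterator, Optional


class Dir:
    """A directory node that keeps a running total of bytes beneath it."""

    def __init__(self, name: str, parent: Optional["Dir"] = None):
        self.name = name
        self.parent = parent
        self.total_bytes = 0                # aggregate for the whole subtree
        self.quota: Optional[int] = None    # byte limit, or None for no quota

    def self_and_ancestors(self) -> Iterator["Dir"]:
        node: Optional[Dir] = self
        while node is not None:
            yield node
            node = node.parent


class QuotaExceeded(Exception):
    pass


def write(directory: Dir, nbytes: int) -> None:
    """Account for a write, checking every quota on the path to the root."""
    # Check first, so a rejected write leaves all totals untouched.
    for node in directory.self_and_ancestors():
        if node.quota is not None and node.total_bytes + nbytes > node.quota:
            raise QuotaExceeded(f"{node.name}: quota of {node.quota} bytes exceeded")
    # Then update the aggregates, so they are current as soon as the write lands.
    for node in directory.self_and_ancestors():
        node.total_bytes += nbytes


root = Dir("/")
project = Dir("projectA", root)
sims = Dir("sims", project)
project.quota = 1000        # takes effect immediately, no volume to provision

write(sims, 600)            # accepted: 600 <= 1000
try:
    write(sims, 500)        # rejected: 600 + 500 > 1000
    rejected = False
except QuotaExceeded:
    rejected = True
```

Since the quota check only reads totals that are already maintained, no scan of the directory tree is needed at enforcement time.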

Visibility and Smarter Analytics

Visibility is a limitation of legacy storage systems. Until recently the scripts or programs supplied to analyse and manage storage systems have been so slow that by the time answers to queries were found, they were out of date. Qumulo also believes artists need to see data clusters as a single volume, rather than dealing with multiple disks or provisioned volumes. IT administrators need real-time visibility to gain control over the situation at any moment, down to the file level, to identify hotspots, follow throughput trending and immediately apply quotas. But they also need to see who uses the most storage over time so that they can plan ahead without overprovisioning.


Qumulo Core analytics

One of Qumulo’s first products was Qumulo Core, Linux-based software that provides detailed data analytics, presenting an accurate picture of what is happening in a file system at any time: which pieces of data are being used, how often each is used, and which applications are using the most data. This information helps companies make decisions about moving and archiving data, and is accessible either through the web GUI or by using the REST API to automate access.

The metadata that Qumulo aggregates about the file system as changes occur, such as bytes used and file counts, is stored and kept up-to-date in directories that are readily accessed for processing. The performance analytics seen in the GUI and extracted with the REST API are based on probabilistic sampling mechanisms built into the file system, made statistically valid because the continuously updated metadata summaries allow the sampling algorithms to give more weight to the larger directories and files.
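The idea of weighting samples by aggregated size can be illustrated with a short sketch. It is an assumption-laden toy, not Qumulo’s algorithm: the tree is a nested dictionary, `subtree_bytes` recomputes totals on each call (a real system would read stored aggregates instead), and `sample_file` is a hypothetical helper.

```python
import random


def subtree_bytes(node) -> int:
    """Total bytes under a node; an int is a file size, a dict is a directory."""
    if isinstance(node, int):
        return node
    return sum(subtree_bytes(child) for child in node.values())


def sample_file(tree: dict, rng: random.Random, prefix: str = "") -> str:
    """Walk down from the root, choosing each child with probability
    proportional to the bytes beneath it, and return the file reached."""
    names = list(tree)
    weights = [subtree_bytes(tree[name]) for name in names]
    name = rng.choices(names, weights=weights, k=1)[0]
    path = f"{prefix}/{name}"
    child = tree[name]
    if isinstance(child, int):
        return path
    return sample_file(child, rng, path)


# Hypothetical tree: 900 bytes of plates, 100 bytes of notes.
tree = {"plates": {"a.exr": 900}, "notes": {"todo.txt": 100}}
rng = random.Random(0)
hits = [sample_file(tree, rng) for _ in range(1000)]
share = hits.count("/plates/a.exr") / len(hits)   # expected to be near 0.9
```

Larger directories and files receive proportionally more samples, which is what allows a small number of samples to stand in for the whole file system.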

Without that intelligence, representing every throughput operation and IOPS within the GUI would be impractical in large file systems. Totals for read and write IOPS, and for read and write throughput, are generated from samples gathered in an in-memory buffer of about 4,000 or more entries that is updated every few seconds.
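A bounded sample buffer of this kind might be sketched as follows. The fixed-size ring buffer, the `Sample` fields and the `ActivityBuffer` class are all assumptions made for illustration; the article does not describe how Qumulo structures its buffer internally.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict


@dataclass
class Sample:
    path: str
    op: str        # "read" or "write"
    nbytes: int


class ActivityBuffer:
    """Keeps only the most recent samples and totals them on demand."""

    def __init__(self, capacity: int = 4000):
        # With maxlen set, appending to a full deque drops the oldest entry.
        self.samples: Deque[Sample] = deque(maxlen=capacity)

    def record(self, sample: Sample) -> None:
        self.samples.append(sample)

    def totals(self) -> Dict[str, Dict[str, int]]:
        out = {"read": {"ops": 0, "bytes": 0}, "write": {"ops": 0, "bytes": 0}}
        for s in self.samples:
            out[s.op]["ops"] += 1
            out[s.op]["bytes"] += s.nbytes
        return out


# A tiny capacity makes the drop-oldest behaviour visible.
buf = ActivityBuffer(capacity=3)
buf.record(Sample("/a", "read", 10))    # falls out once the buffer is full
buf.record(Sample("/b", "write", 20))
buf.record(Sample("/c", "read", 30))
buf.record(Sample("/d", "write", 40))
summary = buf.totals()
```

Holding a fixed number of recent samples keeps memory use constant regardless of how many operations the cluster serves.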

Hot Data, Cool Data

Will said, “REDLAB doesn’t use cloud storage yet but we are making plans to use it depending on control and security, data and transfer rates in and out of the system. Right now we are moving data around a lot, differentiating between hot data that is actively in use and cool data that is used less often, and we want to refine the available tools for a more intelligent approach to using cloud storage.”

The Scalable Block Store mentioned above includes built-in tiering of hot and cold data to optimise read/write performance. When running on premises, as it does at REDLAB, Qumulo hardware combines SSDs, which run faster, with HDDs, which are cheaper to run. SSDs are matched to each HDD on each node, forming a virtual disk, and data is written first to the SSDs.


Because reads typically access recently written data, the SSDs also act as a cache. When the SSDs are about 80 percent full, less frequently used data is expired, that is, moved to the HDDs, which handle capacity and sequential reads and writes of large amounts of data. Metadata, however, stays permanently on the SSD. The 80 percent level optimises performance: when data is moved from an SSD to the HDD, the Scalable Block Store writes it to the HDD sequentially, in a way that optimises disk performance.
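The write-to-SSD-first, expire-at-80-percent behaviour can be modelled in a few lines. This sketch makes several assumptions that are not in the article: it picks the least recently used block as the "less frequently used" data, it ignores metadata pinning, and the `TieredStore` class and its methods are hypothetical.

```python
from collections import OrderedDict
from typing import Dict


class TieredStore:
    """Toy two-tier store: writes land on SSD, cold blocks expire to HDD."""

    def __init__(self, ssd_capacity: int, threshold: float = 0.8):
        self.ssd_capacity = ssd_capacity
        self.threshold = threshold
        # block id -> size in bytes, ordered from least to most recently used
        self.ssd: "OrderedDict[str, int]" = OrderedDict()
        self.hdd: Dict[str, int] = {}

    def ssd_used(self) -> int:
        return sum(self.ssd.values())

    def write(self, block_id: str, size: int) -> None:
        self.hdd.pop(block_id, None)       # a rewrite supersedes the HDD copy
        self.ssd[block_id] = size
        self.ssd.move_to_end(block_id)     # mark as most recently used
        self._expire()

    def read(self, block_id: str) -> str:
        """Return which tier served the read."""
        if block_id in self.ssd:
            self.ssd.move_to_end(block_id)
            return "ssd"
        if block_id in self.hdd:
            return "hdd"
        raise KeyError(block_id)

    def _expire(self) -> None:
        limit = self.threshold * self.ssd_capacity
        # Move the coldest blocks to HDD until usage is back under the limit.
        while self.ssd_used() > limit and len(self.ssd) > 1:
            block_id, size = self.ssd.popitem(last=False)
            self.hdd[block_id] = size


store = TieredStore(ssd_capacity=100)
store.write("a", 30)
store.write("b", 30)
store.read("a")            # "a" is now hotter than "b"
store.write("c", 30)       # 90 bytes exceeds the 80-byte limit, so "b" expires
```

Expiring before the SSD is completely full leaves headroom for incoming writes, which is one plausible reading of why a threshold below 100 percent helps performance.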

“Qumulo’s caching system is valuable to our VFX team because it can improve the performance of recently or frequently accessed data by storing it temporarily in fast storage media, local to the cache client and separate from bulk storage. The cache at REDLAB can become huge at certain stages due to simulations and composites,” said Will. I/O operations use three different types of cache: caching on the client side, and two types on the nodes that share memory. Of the two node-side types, one is the transaction cache, which holds the file system data that the client requests, and the other is the disk cache, which keeps blocks from disk in memory.

Qumulo Future

REDLAB is currently developing new workflows and processes based on how their artists’ software can connect to and take better advantage of Qumulo storage. To do this they are setting up queries and customising the system using Qumulo’s REST API.

The REST API makes it possible to control and inspect the file system with command-line tools, the UI and automated tests. All the information represented in the Qumulo GUI for administrators, for example, is actually generated from calls to the Qumulo REST API. A tab within the GUI serves as a resource documenting the available REST API calls.

Also, all of the information presented in the Analytics tab in the GUI can be retrieved programmatically by making REST calls against the API, and stored externally in a database or sent to applications such as Splunk or Tableau that visualise and analyse data. You can also initiate most file system operations with the REST API.
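As a rough illustration of storing analytics externally, the sketch below loads a JSON payload into SQLite and queries it. The payload shape and field names are hypothetical and do not reflect the actual Qumulo REST API response format; in practice the JSON would come from a REST call documented in the GUI’s API tab.

```python
import json
import sqlite3

# Hypothetical payload, standing in for the body of an analytics REST response.
SAMPLE_RESPONSE = json.dumps({
    "entries": [
        {"path": "/projects/showA", "read_bytes": 1200, "write_bytes": 300},
        {"path": "/projects/showB", "read_bytes": 50, "write_bytes": 900},
    ]
})


def store_activity(conn: sqlite3.Connection, payload: str) -> int:
    """Parse a JSON payload and append its entries to an 'activity' table."""
    entries = json.loads(payload)["entries"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity "
        "(path TEXT, read_bytes INTEGER, write_bytes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO activity VALUES (:path, :read_bytes, :write_bytes)",
        entries,
    )
    conn.commit()
    return len(entries)


conn = sqlite3.connect(":memory:")
stored = store_activity(conn, SAMPLE_RESPONSE)

# Once the data is in a database, ordinary SQL answers questions such as
# "which directory is busiest?"
busiest = conn.execute(
    "SELECT path FROM activity ORDER BY read_bytes + write_bytes DESC LIMIT 1"
).fetchone()[0]
```

The same rows could be forwarded to a tool such as Splunk or Tableau instead of SQLite; the database here is simply the smallest self-contained stand-in.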