Creating a Serverless and Scalable Render Farm with Dis.co

Raymond Lo

February 4, 2020 · 13 min read

In the previous tutorial, we demonstrated how to use Docker to create a portable virtual container that runs Blender (or your own favorite rendering application) on different machines and environments seamlessly. In this tutorial, we will show how you can use Dis.co to automate the job execution and distribution process to parallelize hundreds or thousands of tasks. More importantly, Dis.co also initiates and sets up machines on-the-fly. This enables you to scale your solution quickly with many cloud services such as AWS, Google Cloud Platform, and Packet without ever worrying about any maintenance issues.

Among Dis.co's core features, the first notable one is Python scripting for distributing jobs across many machines seamlessly. The Python code lets users build anything from simple microservices that handle small transactions to sophisticated logic such as machine learning and model training with TensorFlow. Beyond flexibility, the Dis.co platform natively supports and integrates with cloud services such as Amazon AWS, Google Cloud Platform, Microsoft Azure, and even bare-metal cloud services such as Packet, making it an ideal platform for quick deployment.


How Does It Work?

Dis.co takes the Docker container image, automatically deploys it to our cloud machines using a ‘smart agent’, and automatically executes the Python scripts we provide. Think of the Python code as the master logic that connects the workflow, and the Dis.co agent as the “manager” that handles everything else: initializing the virtual machine instances, setting up the virtual containers, executing the tasks, and so on. At the end of the process, Dis.co also collects and stores the results on the server for users to retrieve at any time in the future.

Now, if we put all these steps into context, here is a sample workflow diagram that shows how the Blender project is distributed across the machines and how the rendered (final) images are retrieved at the end of the job. Basically, this is a ‘render farm’!

The workflow of the render farm running on Dis.co.


Building the Workflow of a Dis.co Render Farm

To get started, clone the Git repository to your local machine with the following command.

git clone https://github.com/Iqoqo/disco_blender.git

In the repository, we provide all the instructions and source code required to run the Blender jobs on Dis.co. In particular, let’s look at the following two files, which form the core of the workflow logic.

1. blender_core.py 
2. job_generator.py

First, the file blender_core.py is the logic component that executes the tasks in parallel. On each machine, this Python script runs automatically inside the Docker image discussed in the previous tutorial.

#!/usr/local/bin/python3
import sys
import os
import requests
import pathlib
import time

ROOT_DIR = '/local'
OUT_DIR = './run-result'

#This function sends an image to a dedicated server.
#upload_images.php receives images via an HTTP POST request.
def send_data_to_server(url_post, image_path):
    image_filename = os.path.basename(image_path)
    with open(image_path, 'rb') as image_file:
        multipart_form_data = {
            'userfile': (image_filename, image_file)
        }
        response = requests.post(url_post,
                                 files=multipart_form_data)
    print(response)

#This function executes the Blender project
#range_in and range_out are the first and last frames to be rendered
def run_blender(blender_file, range_in, range_out, upload_web):
    start_time = time.time()
    out_path = OUT_DIR+"/frame_#####"
    blender_exe_path = "/usr/bin/blender-2.81a-linux-glibc217-x86_64/blender"
    #the Docker image bundles Blender 2.81
    command = blender_exe_path +" -b "+ "/tmp/"+ blender_file + \
    " -x 1 -o "+ out_path + " -f " + range_in + ".." + range_out + " > /dev/null 2>&1"

    #run the blender command
    print(command)
    os.system(command)
    end_time = time.time()
    exec_time = end_time - start_time
    print("Execution Time: "+ str(exec_time))

    #exit if URL is not provided
    if upload_web == '':
        return
    #upload results to the web (optional)
    cur_dir = pathlib.Path('./run-result')
    cur_pattern = "*.*"
    for cur_file in cur_dir.glob(cur_pattern):
        print("Uploading to Server..."+str(cur_file)+"\n")
        send_data_to_server(upload_web, cur_file)

def parse_args():
    in_file = None
    try:
        in_file = sys.argv[1]
    except IndexError:
        pass

    print(f"{in_file}")
    return in_file

# main function that takes the input parameters and processes the
# blender project file.
def main():
    in_file = parse_args()

    if in_file is None:
        sys.exit("please provide a Blender batch script file to start")

    URL_blender = ""
    blender_file = ""
    range_in = 0
    range_out = 0

    #the batch file has to have 5 lines:
    #URL to the blender project (a zip file with all dependencies)
    #name of the blender file
    #range in
    #range out
    #upload_web
    with open(in_file) as fp:
        URL_blender = fp.readline().rstrip('\n')
        blender_file = fp.readline().rstrip('\n')
        range_in = fp.readline().rstrip('\n')
        range_out = fp.readline().rstrip('\n')
        upload_web = fp.readline().rstrip('\n')

    #fetch the content from the dedicated URL and extract the package
    print ("Processing "+URL_blender)
    os.system("wget -q -O /tmp/tmp.zip "+URL_blender)
    os.system("unzip /tmp/tmp.zip -d /tmp")
    
    run_blender(blender_file, range_in, range_out, upload_web)
    print ("Blender Processing Completed")

if __name__ == "__main__":
    main()

The script may seem overwhelming at first glance. However, if we break it down, it consists of 3 core functions. 

1. main - parses the input data and runs the application based on the input parameters. It also handles the download and extraction of the Blender project.
2. run_blender - runs the Blender application in the command line. 
3. send_data_to_server - uploads the result images to an external server for storage.

If you recall from the previous diagram, this is the Python script that Dis.co automatically distributes to each server.

That’s it for the run script. The second question we have to answer is how to determine what runs on each machine. For example, decisions such as the batch size (i.e., how many frames to render per machine) and which Blender project to render have to be made prior to distributing the jobs. To answer that, we will write a job generator that generates these parameters.

Let’s take a look at the job_generator.py Python file posted in the repository. The purpose of this script is to create a set of input files that enable Dis.co to execute our batch jobs. Each of these files will be processed on a different machine, all in parallel, and the results will be returned to the Dis.co server once completed.

#!/usr/local/bin/python3

import os

start = 0        # index of the first job
end = 10         # index of the last job (exclusive)
skip = 1         # batch size: frames rendered per machine
directory = "classroom_sample/"
job_name = "job_"
url = "http://100.25.247.222/uploads/classroom_720.zip"
blend_file = "classroom/classroom.blend"
upload_web = "http://100.25.247.222/uploads/upload_images.php"

if not os.path.exists(directory):
    os.makedirs(directory)

#write one five-line task file per job
for i in range (start, end):
    range_in = i*skip+1
    range_out = (i+1)*skip
    file_name = job_name+str(range_in)+"_"+str(range_out)+".txt"
    print("File written: "+file_name)
    with open(directory+file_name, "w+") as f:
        f.write(url+"\n")
        f.write(blend_file+"\n")
        f.write(str(range_in)+"\n")
        f.write(str(range_out)+"\n")
        f.write(str(upload_web)+"\n")

Basically, this script generates a set of data files that will be passed to each machine at runtime when Dis.co distributes the jobs. The parameters start and end define the range of jobs, and skip is the batch size: how many frames will be processed on each machine. Lastly, url is the link to the Blender project (as a zip file), and blend_file is the path to the Blender project file inside that zip.
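
For instance, if we changed skip to 5 and end to 2, each machine would render a batch of five frames instead of one. Here is a quick sketch (values chosen purely for illustration) of how the loop maps those parameters to frame ranges:

# Illustration of how start, end, and skip map to frame ranges.
start, end, skip = 0, 2, 5   # two jobs, five frames per machine

for i in range(start, end):
    range_in = i*skip+1
    range_out = (i+1)*skip
    print("job_%d_%d.txt renders frames %d..%d" % (range_in, range_out, range_in, range_out))

# Output:
# job_1_5.txt renders frames 1..5
# job_6_10.txt renders frames 6..10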

Once you execute the script, you will see a list of text files in the classroom_sample directory. In a later step, we will upload these to the Dis.co server with either the web interface or the command-line interface (CLI).

8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_1_1.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_2_2.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_3_3.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_4_4.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_5_5.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_6_6.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_7_7.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_8_8.txt
8 -rw-r--r--   1 raymondlo84  staff     126 Jan 28 16:45 job_9_9.txt
8 -rw-r--r--   1 raymondlo84  staff     128 Jan 28 16:45 job_10_10.txt

Inside each of the data files, there are five lines of information:

1. The URL of the Blender Project in a zip file format. 
2. The Blender Project file name (including the path) 
3. The first frame to be rendered in the animation sequence
4. The last frame to be rendered in the animation sequence
5. The URL for uploading images to an external server. 

Here is an example of a task file that we will upload to Dis.co. In this case, each machine we allocate renders a batch of exactly one frame (frame 1 for this file).

http://100.25.247.222/uploads/classroom_720.zip
classroom/classroom.blend
1
1
http://100.25.247.222/uploads/upload_images.php
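
Before uploading, you can sanity-check a generated task file with a few lines of Python. This is just a minimal sketch mirroring the five-line format above; validate_task_file is a helper name of our own, not part of Dis.co:

#!/usr/local/bin/python3
# Minimal sanity check for a five-line task file (our own helper, not part of Dis.co).
def validate_task_file(path):
    with open(path) as fp:
        lines = [line.rstrip('\n') for line in fp]
    assert len(lines) >= 5, "task file must contain 5 lines"
    url, blend_file, range_in, range_out, upload_web = lines[:5]
    assert url.startswith("http"), "line 1 must be the project zip URL"
    assert blend_file.endswith(".blend"), "line 2 must be the .blend file path"
    assert int(range_in) <= int(range_out), "frame range must be ascending"
    return url, blend_file, int(range_in), int(range_out), upload_web

print(validate_task_file("classroom_sample/job_1_1.txt"))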

Next, we can test this solution locally with the Docker image we created previously. 

#run one job locally - this is exactly what will run on the Dis.co server
docker run -it -v `pwd`:/local raymondlo84/disco_blender /local/blender_core.py /local/classroom_sample/job_1_1.txt
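
If you want to smoke-test every generated task file locally before dispatching, a small Python driver can loop over the same docker command. This is a sketch under the assumption that the raymondlo84/disco_blender image from the previous tutorial is available locally:

# Run every generated task file through the local Docker image,
# one at a time, mirroring what Dis.co will run remotely.
import glob
import os
import subprocess

for task_file in sorted(glob.glob("classroom_sample/job_*.txt")):
    print("Running " + task_file + " locally...")
    subprocess.run([
        "docker", "run", "--rm",
        "-v", os.getcwd() + ":/local",
        "raymondlo84/disco_blender",
        "/local/blender_core.py", "/local/" + task_file,
    ], check=True)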

Finally, we have all the pieces that we need to dispatch a job on Dis.co. 


Dispatching your first Blender Job on Dis.co 

With the workflow scripts in place, we are now ready to deploy jobs on Dis.co. First, we will log in to Dis.co through the web UI. Sign up here to obtain a free trial account with 500 hours of free compute credits.

Dis.co Login Screen. (http://app.dis.co)

The first step is to link the custom Blender Docker image by following the documentation. Specifically, we want to link the Docker image (e.g., “raymondlo84/disco_blender:latest” from Docker Hub) to the Dis.co account prior to executing the job.

Once the image is linked, head back to the home screen and click the “New job” button. This brings up the interface that allows you to upload and execute the Blender job.

We then fill in the “Script” field with the blender_core.py Python script and the “Data” field with the list of task files (e.g., job_1_1.txt, job_2_2.txt, etc.) generated by job_generator.py earlier.

Next, fill in the “Job title” as “blender_10_frames_disco” and choose the “Job Size” as “xlarge”, which refers to a VM with 16 CPU cores. We also check the “Autorun the Job” checkbox to start the job automatically. Lastly, we click the “Create Job” button and we are done. Dis.co now takes care of everything!

From this interface, you will see five status types: All, Queued, Running, Done, and Error. Once the job starts, you can open the “Real time log” and watch the progress (stdout) of the running tasks.

Finally, once the jobs are completed, you can download the results (as a zip file) directly from the “Done” tab in the web interface.

Once downloaded and unzipped, you will find three files inside. frame_00001.png is the output returned from the Blender project; in this case, the rendered scene. IqoqoTask.stdout.0.txt is the standard output, and IqoqoTask.stderr.0.txt is the standard error output from the Python script.
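
If you prefer to inspect the results programmatically, a short sketch using Python’s zipfile module works as well (“results.zip” here is a placeholder for whatever name your downloaded archive has):

# Extract a downloaded results archive and list its contents.
# "results.zip" is a placeholder name for the file you downloaded.
import zipfile

with zipfile.ZipFile("results.zip") as zf:
    zf.extractall("results")
    for name in zf.namelist():
        print(name)  # e.g., frame_00001.png, IqoqoTask.stdout.0.txt, ...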

If you open the PNG file, you will find the output of the “Class Room” demo scene. As you can see, the lighting of the scene is natural and photorealistic.

If you are a developer, however, you may want to control this workflow from another script. Luckily, Dis.co also provides a command-line interface (CLI) for developers who want finer control and scripting, as described next.


Running jobs with Dis.co CLI 

In this setup, we assume that you have Python 3.6+ pre-installed on your machine. If not, follow the instructions at https://www.python.org/downloads/ to obtain the latest copy of Python. Currently, the Dis.co CLI supports Windows, Linux, and macOS.

To install the Dis.co CLI, run the following pip install command [https://docs.dis.co/quick-start-guide/using-the-cli]:

pip3 install disco --upgrade

Once it’s completed, you can verify the installation by typing the command “disco” in the terminal.

raymondlo84@Raymonds-MacBook-Pro ~ % disco
Usage: disco [OPTIONS] COMMAND [ARGS]...

  Root CLI Command

Options:
  -h, --help  Show this message and exit.

Commands:
  cluster     Manage clusters.
  docker      Manage docker images.
  job         Manage jobs.
  login       Login to Dis.co.
  logout      Logout of Dis.co.
  repository  Manage repositories.
  version     Show the installed Dis.co CLI version.

In the previous section, we generated the task files (e.g., job_1_1.txt, job_2_2.txt, etc.). We can now use the command-line interface to run a new job on the Dis.co server in a few steps. We should also have the Docker image set up as “raymondlo84/disco_blender:latest” by following this documentation.

1. Log in with your username and password

disco login 

2. Create and run the job

disco job create -cit l --name "blender_example" -s blender_core.py -i "classroom_sample/job_*.txt" -r

3. Monitor the job with Dis.co’s view command. The job_id is the ID returned by the “disco job create” command.

disco job view [job_id]

4. Download the results (once the job is completed)

disco job download-results [job_id] -d .
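
To automate the whole loop from Python, you can drive these same CLI commands with subprocess. The sketch below assumes you have already run “disco login”; since we don’t want to guess the exact format of the create command’s output, we let it print to the terminal and paste the job ID back in:

# Drive the Dis.co CLI from Python: create a job, then monitor and
# download the results. Assumes "disco login" has already been run.
import subprocess

# 1. Create and run the job (same flags as above).
subprocess.run([
    "disco", "job", "create", "-cit", "l",
    "--name", "blender_example",
    "-s", "blender_core.py",
    "-i", "classroom_sample/job_*.txt",
    "-r",
], check=True)

# 2. The create command prints the job ID; paste it here.
job_id = input("Job ID: ").strip()

# 3. Monitor, then download the results into the current directory.
subprocess.run(["disco", "job", "view", job_id], check=True)
subprocess.run(["disco", "job", "download-results", job_id, "-d", "."], check=True)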

A Powerful Tool 

Dis.co is a powerful tool that can orchestrate large data sets and process computationally intense jobs with minimal learning and maintenance. In our Blender use case, we created the render farm with less than a hundred lines of code and the effort of merely a single engineer.

By default, Dis.co provides virtual machines with up to 16 cores and 16 GB of RAM. In many use cases, that is plenty of processing power to get most of the work done. In the rendering world, however, that is barely the tip of the iceberg when it comes to daily workloads. The next urgent questions are: can we do better, and how?

In the next blog, we will discuss the integration of the Packet platform with Dis.co to extract the best performance out of bare-metal machines with over 24 cores (48 threads) and 64 GB of RAM. With such powerful machines (at least by today’s standard in 2020), we can easily cut our rendering tasks down from days to hours by parallelizing them across dozens or hundreds of machines. For example, the following path-tracing movie now takes less than 30 minutes to render by distributing the jobs across 29 machines with 48 CPUs each (1,392 CPUs in total), instead of waiting over a day on a single workstation. Most importantly, that’s something we can do with the click of a button. Amazing! Stay tuned for the next tutorial.

To see how Dis.co can accelerate video rendering for your agency or company, please visit try.dis.co/render.
