How can I get HitFilm to use all available CPU capacity?

DannyDonkey Posts: 11 Just Starting Out*

I'm working on simple projects where I make videos by combining screen and audio recordings. The screen recording output file is MP4, which I know is an inefficient editing format in HitFilm. It can and should use all of my 4.34 GHz 8-core CPU. However, usage very often drops to 25%, leaving 75% unused, and of course it lags like it's installed on an underpowered system. How can I make HitFilm use all available CPU power when the application clearly needs it?


  • Triem23
    Triem23 Posts: 20,288 Power User
    edited May 2017

    Long story short, you probably can't, since the CPU in HitFilm only handles codec compression/decompression, disk I/O, and physics calculations for particle sims. In HitFilm the GPU is doing all rendering tasks and most of the heavy lifting. 

    So, what's your GPU? 

    And I'll tag @NormanPCN to tell you the other reasons you've asked the wrong question. 

  • DannyDonkey
    DannyDonkey Posts: 11 Just Starting Out*

    GeForce GTX 970.

    I really want to understand what I do not yet understand about how HitFilm works.

  • Triem23
    Triem23 Posts: 20,288 Power User

    No problem. We're glad to help. I tagged Norman as he's more knowledgeable about system resource usage than I am. 

  • chibi
    chibi Posts: 255 Enthusiast

    I've posted about this many times. When encoding video, HF is just slow because it's not using all CPU cores. Hopefully the next version will improve multi-threading.

  • Triem23
    Triem23 Posts: 20,288 Power User

    @chibi I believe Hitfilm uses multi-threading, BUT one thread per video stream. 

  • [Deleted User]
    [Deleted User] Posts: 1,994 Just Starting Out
    edited May 2017

    HitFilm's H.264 encoder is multi-threaded and will make use of all cores and hyper-threading if available.

    The reason the CPU Usage doesn't reach (or get close to) 100% is because HitFilm does all of its timeline rendering on the GPU.  No matter how simple or complex your timeline is, it gets rendered on the GPU.

    The exporting works like this:  Render frame on GPU; encode it on CPU; render next frame on GPU; encode it on CPU.  And so on.  At periodic intervals a bunch of encoded frames will also be written into the file container on disk.  The encoding threads have to wait on the next frame to be rendered by the GPU before they can actually encode it.

    The speed of the GPU (not just in terms of processing power, but also how quickly the driver can upload textures to the GPU, and read textures from the GPU) will directly impact the utilization of the encoding threads.

     I haven't found an answer about pre-exporting, but there you go.
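    The render-on-GPU / encode-on-CPU loop described above can be sketched as a toy producer-consumer model. This is purely illustrative: the queue size, frame counts, and per-stage timings below are made-up numbers, not HitFilm internals. The point is that no matter how many encoder threads exist, they sit idle whenever the single GPU producer hasn't delivered the next frame.

    ```python
    # Toy model of the export loop: one "GPU" producer renders frames one
    # at a time; a pool of "CPU" encoder threads can only encode frames as
    # fast as the producer delivers them. All timings are hypothetical.
    import queue
    import threading
    import time

    FRAMES = 24
    RENDER_TIME = 0.01   # simulated GPU render + readback per frame
    ENCODE_TIME = 0.002  # simulated CPU encode per frame (much faster)
    WORKERS = 4

    frames = queue.Queue(maxsize=2)   # small buffer between GPU and CPU
    encoded = []
    lock = threading.Lock()

    def gpu_producer():
        for i in range(FRAMES):
            time.sleep(RENDER_TIME)   # GPU renders the frame
            frames.put(i)             # hand it to the encoder threads
        for _ in range(WORKERS):
            frames.put(None)          # sentinels to stop the consumers

    def cpu_encoder():
        while True:
            frame = frames.get()      # encoder thread idles here, waiting
            if frame is None:
                return
            time.sleep(ENCODE_TIME)   # encode on the CPU
            with lock:
                encoded.append(frame)

    start = time.time()
    producer = threading.Thread(target=gpu_producer)
    workers = [threading.Thread(target=cpu_encoder) for _ in range(WORKERS)]
    producer.start()
    for w in workers:
        w.start()
    producer.join()
    for w in workers:
        w.join()
    elapsed = time.time() - start

    # Even with 4 encoder threads, total time tracks the GPU's pace
    # (roughly FRAMES * RENDER_TIME), because the encoders mostly wait.
    print(f"encoded {len(encoded)} frames in {elapsed:.2f}s")
    ```

    Running this, the wall-clock time is set almost entirely by the producer, which is why adding encoder threads (or CPU cores) doesn't raise CPU utilization in this kind of pipeline.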

  • Triem23
    Triem23 Posts: 20,288 Power User

    Anyway, this all comes down to the CPU waiting on GPU activity; hence, CPU utilization in HitFilm is usually pretty low. 

  • DannyDonkey
    DannyDonkey Posts: 11 Just Starting Out*

    So in general, a faster, higher-end GPU will render and do stuff faster?

    Thanks for the answers, everyone. :)

  • [Deleted User]
    [Deleted User] Posts: 1,994 Just Starting Out

    I would love to see a clarification on this, but at the same time, if there are weak points, even minor ones, any negativity is bad because it's a very competitive business. They're an amazing team and they make amazing software, so I think we should leave it at that; I don't mean any disrespect to you. :)

  • NormanPCN
    NormanPCN Posts: 4,373 Expert

    Tag I'm it.

    As Triem stated, HitFilm only uses the CPU for certain things. Video decode is a primary one; the other is video encode. So during playback, subtract video encode and we only have video decode using the CPU. Also remember that during playback, speed/utilization is clamped to the frame rate of the timeline. Simplistically speaking, on a 30p timeline, if HitFilm is able to sustain 60p then utilization will be about half.
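    That frame-rate clamp can be written as a back-of-the-envelope calculation (the function name and numbers are mine, for illustration only):

    ```python
    # Playback is clamped to the timeline frame rate, so if the system
    # could sustain a higher rate, decode utilization scales down.
    def playback_utilization(timeline_fps, max_sustainable_fps):
        """Rough fraction of decode capacity actually used during playback."""
        return min(1.0, timeline_fps / max_sustainable_fps)

    # 30p timeline on a system that could sustain 60p -> about half.
    print(playback_utilization(30, 60))   # 0.5
    # If the system can barely keep up, utilization maxes out at 1.0.
    print(playback_utilization(30, 30))   # 1.0
    ```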

    The only way video decode can really load up the CPU with HD material is to composite multiple media streams at any given point in time. UHD only takes a couple of media streams. This is taken from tests on my 4C/8T CPU (4770K).

    If you are doing pure CG stuff, you are never going to see playback jack CPU above 1.x threads. On my 4C/8T machine I typically see ~15% in such cases, 12.5% being one core/thread out of the 8 fully loaded. There is CPU code that needs to tell the GPU what to do.

    Media decode is multi-threaded. It seems a thread pool is created to service all decode. On my 4C/8T machine I see 8 threads created for decode. How these threads are used is indeterminate: in various circumstances/tests I see some threads terminate with no execution time, and in others all threads get execution. Even though media decode is threaded, it may not be asynchronous to the timeline.

    The AVC encoder (aka MP4) is multi threaded but the work it does is dependent on Hitfilm feeding it frames to encode. This is true of any encoder. We cannot really tell if the AVC encoder is asynchronous with the timeline. When the encoder is waiting for Hitfilm to give it a frame then those threads are idle.

    Generally speaking, HitFilm is not as efficient as other NLEs (Vegas/Premiere/FCPX). In basic playback this shows up a lot with AVC media, which has a high decode overhead. It is not that HitFilm is bad with AVC and good with others; the performance difference is the same with the other codecs. It is just that if the app achieves realtime performance, we generally don't care.

    At times when HitFilm is working as hard as it can, it still does not utilize CPU/GPU resources as well as other NLEs. This is very variable and depends on circumstance. I think HitFilm might be doing some things synchronously in internal operation, whereas other NLEs might be async. This can affect performance.

    GPU readback performance in HitFilm is pathetic. During playback this does not come into play. However, during a RAM preview and during an export (encode) it very much affects performance. It slows down, even stalls, the timeline and therefore slows down the encode. It also causes lower utilization. Some effects have portions or all of their operations done on the CPU, so their performance is affected by this too. GPUs can read back fast as hell; it has been suggested the GPUs are at fault here, but this is a crock. HitFilm is using OpenGL, and GL is probably a strong contributor here. Whether GL is a minority or majority contributor to this performance issue is unknown to us end users. NLEs that read back fast as hell are using OpenCL/CUDA.
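    The effect of slow readback on export throughput can be modeled with some hypothetical per-frame timings (all numbers below are made up; they only illustrate the shape of the problem). If readback runs synchronously, every frame pays render + readback + encode in sequence; if the stages were overlapped (pipelined), throughput would instead be set by the slowest single stage.

    ```python
    # Export throughput under two scheduling models, with made-up timings.
    def fps_synchronous(render_ms, readback_ms, encode_ms):
        # Each frame waits for all three stages before the next starts.
        return 1000.0 / (render_ms + readback_ms + encode_ms)

    def fps_pipelined(render_ms, readback_ms, encode_ms):
        # Stages overlap; the slowest stage sets the frame rate.
        return 1000.0 / max(render_ms, readback_ms, encode_ms)

    render, readback, encode = 5.0, 20.0, 4.0   # slow readback dominates
    print(fps_synchronous(render, readback, encode))   # ~34.5 fps
    print(fps_pipelined(render, readback, encode))     # 50.0 fps
    ```

    Either way, a 20 ms readback caps the pipeline well below what the render and encode stages alone could sustain, which matches Norman's observation that readback stalls both RAM preview and export.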


  • DJNoise
    DJNoise Posts: 1 Just Starting Out

    Hello Guys & HF Devs

    Sorry for picking up this old thread.

    But rendering in HitFilm (in my case Express) on an M1 Mac with 64 GB RAM is absolute horror.

    I created an audio visualizer with motion blur and depth of field turned on.

    For a 5:22 min video, I calculate the rendering will finish in 17 hours 🙀

    That is crazy: the GPU is almost 100% utilized, but the CPU only uses about 120%. If I compare it to HandBrake, for example, which uses 600-700%, I call that extremely weak.

    I've read that everything on the timeline goes to the GPU only. But why did they program it that way???

    It would be better if GPU and CPU were combined, so everyone would be happy. Or at least offer a selection in the settings.

    It's also a pity that the app itself is still an Intel build and not yet universal for M1. That could also increase performance.

    I hope that the developers at HF will recognize this and change it because I am extremely disappointed the way it is now.

    Cheers and happy rendering 🙈

  • triforcefx
    triforcefx United StatesPosts: 1,641 Moderator

    @DJNoise While it feels like it’s been out for a while, M1 is still a very new chip architecture for Mac. There is still a lot of optimization that needs to happen… and it’s still not officially supported by HitFilm even though they have started fixing some M1 bugs…. So the fact that it works at all is pretty impressive.

    There are a few reasons that M1 isn’t fully supported (and why they haven’t made a native port):

    Video editors are some of the most complex consumer software there is, and the code is written specifically for the x86 hardware it runs on. Most other software is very “high-level” and can be ported to other architectures with relative ease. But high-level doesn’t work when performance and hardware-specific code are required. Large amounts of code need to be completely rewritten, which takes lots of time and testing… even for software companies with literally 100x the developers HitFilm has.

    HitFilm also has a number of dependencies that simply don’t exist on M1 natively. OpenGL is a big one… HitFilm is written for OpenGL, but Apple wants everyone to move to their Metal API and refuses to properly port OpenGL to M1. This is a major issue, which would require large amounts of code to be rebuilt from the ground up. Because Metal isn’t cross-platform, they would then have to support 2 completely different codebases… one for Mac and one for PC, as opposed to 2 relatively similar codebases that they have now.