threadripper bottleneck???

Discussion in 'Troubleshooting: Bugs, Questions and Support' started by gameboy3800, Jul 30, 2018.

  1. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    Hello. I play BeamNG a lot for my YouTube channel. I'm aware that it's a very core-heavy game because of how it handles real-time soft-body car physics, so I built the system you see in my signature. One thing I've noticed is that even with my 1950X and Vega 64, I sometimes still don't get high framerates. The processor shows about 20% utilization or less, and the graphics card only about 50%. Some maps have a lot of terrain, and those do use the GPU to its full potential, but I've never seen my processor get anywhere near full utilization, even with 64 cars all running at once. I've moved the game install and all the mods I use to my NVMe boot drive, so I don't think it's getting bottlenecked looking for files. Is the game somehow not optimized for highly multithreaded processors, despite being a very core-heavy game with a 6-core Intel as the recommended CPU?

    I believe a good test you could do to help out is to get the Roane County, Tennessee map, spawn in the burger part of it, and post your performance. With max settings at 1080p I was barely hitting 20 fps. I tried upping the resolution to 1440p and saw literally no change. I even dropped to 480p: also no change. Disabling shadows brings me to 22. Disabling dynamic reflections brings me up to 40 fps. Disabling both brings me up to 70 fps, but my processor still only shows 12% and the GPU only 65%, even with ReShade color boosts.

    I know my processor is probably overkill, and I don't expect any game to ever max it out. But even with 64 cars loaded at the lowest settings at 720p, I was getting 15 fps at best with only 65% of my processor in use.

    Has anyone else experienced their computer just not using its full power to play Beam? Is there a fix I can apply? Any help would be appreciated. Thanks in advance.
     

    Attached Files:

    • lowfps.jpg
    • 140ptst.jpg
    • 480sts.jpg
    • BeamNG2018-07-3004-36-36-75.jpg
    • noreflect.jpg
    • noshadows.jpg
    • noreflectnoshadows.jpg
    #1 gameboy3800, Jul 30, 2018
    Last edited: Jul 30, 2018
  2. Nadeox1

    Spinning Cube
    BeamNG Team

    Joined:
    Aug 5, 2012
    Messages:
    14,683
    Utilization percentages are not a good way to measure this.
    The game and system will use the hardware depending on what's needed at the moment.
    If your CPU usage is low, it pretty much means that something else is bottlenecking the system (also note that this map pushes things to their very limits).

    A better way is to use the in-game 'PERFORMANCE' tool. It's in the side menu ('Advanced Functions' must be enabled). Press it, then press the HIDE UI button at the top, and when the UI appears again, take a screenshot and post it here.
    Do that on Gridmap and on that map.
     
  3. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    The only thing not shown in these screenshots is CPU speed, but I can assure you it's locked to 4.0 GHz across all 16 cores.
     

    Attached Files:

    • BeamNG2018-07-3005-42-10-59.jpg
    • BeamNG2018-07-3005-43-30-95.jpg
  4. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    Your single-core performance is the limit right there. That is Bob's Tennessee map, and the number of objects in it makes a huge load for a single core. Some physics work runs on that core, as does some graphics work, and since each object seems to add a small CPU load when shadows are on, that one core gets overloaded.

    When that core gets overloaded, the GPU has to wait until that specific CPU core can process the data for the GPU to render. Of course there are buffers, and a certain amount of peaking is not noticeable, but at some point the GPU waits long enough to cause a noticeable fps drop.
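
    A rough way to picture this, as a sketch only: if the slower of the CPU thread and the GPU sets the frame time, then lowering the resolution (which only shortens GPU time) changes nothing while the CPU thread is the slower one. The millisecond figures below are made-up illustrative values, not measurements from this thread, and the model ignores the buffering mentioned above.

```python
def frame_rate(cpu_ms: float, gpu_ms: float) -> float:
    """Frames per second when the slower of the two stages sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)


# Hypothetical per-frame times (assumptions for illustration only).
cpu_ms = 50.0        # time the one busy CPU thread needs per frame
gpu_ms_1080p = 20.0  # GPU render time per frame at 1080p
gpu_ms_480p = 8.0    # GPU render time per frame at 480p

fps_1080p = frame_rate(cpu_ms, gpu_ms_1080p)
fps_480p = frame_rate(cpu_ms, gpu_ms_480p)

# Fraction of each frame the GPU is actually busy at 1080p.
gpu_busy_1080p = gpu_ms_1080p / max(cpu_ms, gpu_ms_1080p)

print(f"1080p: {fps_1080p:.0f} fps, 480p: {fps_480p:.0f} fps, "
      f"GPU busy at 1080p: {gpu_busy_1080p:.0%}")
```

    In this toy model both resolutions give the same frame rate and the GPU sits partly idle, which is the pattern described in the first post.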

    Now, when you have reflections set so unrealistically high (when was the last time you saw ground objects 1000 meters away reflected in a car's paintwork?), that single-core CPU load multiplies roughly by 6, and shadows multiply it by who knows how much.

    Also, the nature of that CPU load is that it spikes very quickly, so it is very hard to spot with Task Manager, which samples too slowly.

    There is a bit more about it here, and even more in the performance testing link in my signature:
    https://www.beamng.com/threads/cpu-single-core-load-gets-bit-high-at-eca.56637/

    It has been like that as long as I have known, but in 0.13 it is perhaps even more noticeable, since something more seems to be done on that single thread/core on the physics side: pressing J to pause physics boosts FPS.

    I don't know the single-core Cinebench result of Threadripper or how it compares to a Coffee Lake i5, which I guess currently holds the fastest single-core performance? The crazy part is that such an i5 could well be faster.

    That single-core load is, and has been, the single greatest thing limiting Beam's performance for as long as I have known BeamNG.
     
  5. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    In my experience, the only time pausing physics has ever made a difference for my Threadripper was when I paused it to spawn 100 cars just for fun. Unpausing crashed the game. When I tried with 64 cars, there was barely a difference between paused and unpaused. My single-core CB score is 162, which is right there with an i5 8400. An 8700K gets 195 single-core CB.

    I asked a friend to test the Roane County map the same way I did, and he got exactly the same fps as I did with an i5 4690K and GTX 750 Ti. His CPU utilization was higher, obviously, because it's only a quad core. But the GPU, despite being significantly lower in maximum performance, was still only pushing 65% (I know percentages aren't a good way to measure, but still). I thought that was intriguing. It means that map somehow pushes the game engine that hard.
     
  6. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    I think my 6700 got around 164 in CB R15. I would think the 8700K should be higher, as the 6700K got around 195 and Coffee Lake should be around 20% faster in single-core performance? Who knows about those, really.

    Total CPU load indeed doesn't tell much: if a single core is maxed out, that shows up as only a small percentage of total CPU load.
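
    As a quick illustration of that arithmetic (a sketch, assuming the overall figure is simply the average across logical processors): one fully loaded thread on a 16-core/32-thread 1950X barely registers, while the same load on a 4-thread i5 4690K looks substantial.

```python
def total_utilization(busy_threads: float, logical_cpus: int) -> float:
    """Overall CPU percentage when `busy_threads` logical CPUs are at 100%
    and the rest are idle (assumes a plain average across logical CPUs)."""
    return 100.0 * busy_threads / logical_cpus


# Logical processor counts for the CPUs mentioned in the thread.
threadripper_1950x = total_utilization(1, 32)  # 16 cores, 32 threads
i7_6700 = total_utilization(1, 8)              # 4 cores, 8 threads
i5_4690k = total_utilization(1, 4)             # 4 cores, 4 threads

print(f"1950X: {threadripper_1950x:.1f}%  "
      f"6700: {i7_6700:.1f}%  4690K: {i5_4690k:.1f}%")
```

    So a reading of 8 to 15% overall on the 1950X is compatible with one thread being completely saturated while a few others do light work.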

    Then, if the software swaps a lot of data in and out of memory, memory bandwidth can become the bottleneck. That does not show up in individual CPU core loads, but it causes GPU load to drop along with CPU load, as everything waits for memory.

    I'm not sure if there is a tool to find that out. I don't think even HWiNFO shows it, but I can't remember now.

    Oh yes, your Tennessee map graph screenshot is missing the CPU graph from the top; it might be an interesting one to compare to Gridmap.

    People laughed at my 6700 and GTX 1050 Ti, saying the GPU would be the bottleneck, but more often it has been CPU single-core performance that is the bottleneck. With Automation cars the GPU does start to become the bottleneck, though.

    Also, I don't think pausing physics did anything in previous versions, but with 0.13 there seems to be a clear bump in fps, though that depends a lot on graphics settings. At Tennessee, however, it is the mesh road objects that bring FPS down. The same happens in the Glass view mountains map (the one with yellow autumn trees, as there are two maps with the same name).

    Of course, the sheer size of Tennessee is a bit much for the game engine, which is made mostly for 2 km by 2 km maps.

    I don't know how much the devs can do to improve hardware utilization. It might be engine related, and it might not be possible to move those areas of code to another thread, because you can't multithread parts of code that need to run in sync, afaik.
     
  7. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    My Threadripper uses quad-channel memory across 8 sticks, so I wouldn't think it's a memory bandwidth problem. The Threadripper is still a good choice in my eyes because its gaming performance isn't cripplingly bad or anything like that, and it's a rendering powerhouse.

    Yes, an overclocked 6700K can hit about 200 single-core CB. Honestly, your setup is pretty good. The 6700 is still a great processor, and the 1050 Ti is about as good as you can get perf/dollar-wise.

    I just wanted to know if there was anything I could do to my system to help it be better utilized. If it's game-engine limited, then that's that. Maybe I'll build something with the new 8-core Coffee Lake refresh chips for lols, just to see how it runs Beam. Who knows, maybe it'll be the most ideal setup.
     
  8. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    An 8-core build would give interesting data for comparison. Putting the comparison up on YT might cause quite a big war, though; one camp or another would be upset, even if both builds run everything just fine :D

    I have 4 sticks of DDR4 2666 MHz, 16 GB total. If Beam had a memory bandwidth issue, I would have it worse than you, and I don't see one, so you are correct: there probably isn't one.


    Things to try.

    Have you tested disabling the hyper-threading equivalent (SMT)? There used to be some issues with it causing slowdown. I'm not sure if that has been fixed already; I think it was some kind of Windows issue, but it might be worth testing.

    Set shadows to partial, and mostly keep dynamic reflections off, or under 300 meters with faces per update turned down a bit. I doubt there is much else to do without sacrificing visual quality.

    Shader and light quality have an effect, but only when going quite low. Getting reliably better than 60 fps is a bit of a challenge in the current version because of single-core limits. That is what I have found, but I need to examine it more with a different GPU that I get to borrow later in the week. If I don't get an FPS boost in the problematic areas, then there is little to upgrade except to a K-series CPU, which would give me a performance boost.
    --- Post updated ---
    I did test on high settings, with and without shadows. I had no checkboxes checked, and the resolution was even a bit less than 1080p, but my GPU is a bit slow. I then set everything to max, with the windowed-mode window as big as it would go, which is close to 1080p, and got 55 fps with a 1050 Ti and 6700 non-K. It is indeed a bit odd how @gameboy3800's system gets so little more.

    My LUA delay is lower, but I really can't tell anything else from these graphs. I have more green on the CPU graph, as expected from a slow GPU. He has more pink, which is othersFrameMs and does not tell me anything, yet he has lower CPU delay and GPU delay, but only about 10 fps more.
    GM_max_all.png

    I'm quite certain there is something odd in this mystery.
     

    Attached Files:

    • GM_no_shadows.png
    • GM_shadows.png
    • TN_no_shadows.png
    • TN_shadows.png
  9. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    The main thing with Threadripper is that it's essentially 2 CPUs on one PCB, and they have to communicate with each other, which adds some latency. The old Ryzen issue with simultaneous multithreading/hyperthreading has been more or less fixed in Windows thanks to updates, and keeping the system on the High Performance power plan has always reduced latency and scheduling issues.

    I've heard of some old maps that don't play right with Radeon GPUs. Not sure if that still applies, though: when I had a 1080 Ti for a couple of days, I saw almost no fps or performance difference between it and my current Vega 64 on a map that had 3 buses, 3 car stacks, one camper and one ambulance. That's possibly the most demanding course I've driven on for both the CPU and GPU, and still both the 1080 Ti and the Vega were only topping out at 80% utilization. The CPU, I believe, was at roughly 30%, iirc.
     
  10. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    Okay, I tested a 1080, and it is as I suspected: CPU single-core performance is holding me back even with the 1050 Ti. So it is no wonder you don't get more FPS, as your CPU is similar in single-core performance. That means no matter what monster GPU we slap in, there will be no more FPS.

    With all details maxed out except dynamic reflections toned down, I'm getting barely over 50 fps at ECA town with the D15 chase cam. More test results here:
    https://www.beamng.com/threads/cpu-single-core-load-gets-bit-high-at-eca.56637/#post-901996

    At some point the devs will improve that aspect, I'm sure. For now, dynamic reflections are better left off, with shadows at partial and shader and light quality at normal instead of high. From memory, those are the settings that affect single-core CPU load, and since there now seems to be more physics workload on the same thread, it is getting a tad high.
     
  11. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    I will try these settings. Thank you.
     
  12. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    I also noticed that SSAO seems to carry some CPU load, and DOF, light rays and AA seem to have a bigger effect on FPS than usual, as if each of them used a tiny bit of CPU time and together that starts to show.

    I haven't tested adjusting post processing. It might be that it is the one adding CPU time and the others just multiply it, or something, but setting post processing to normal has quite a big effect on visual quality.

    I need to figure out more about the best balance of visual quality vs. CPU load to really work all this out in 0.13, but it is a bit funny how one has to balance CPU load with graphics settings instead of setting graphics to what the GPU can do :p
    --- Post updated ---
    Oh, and I just noticed that, depending on settings and which vehicle you have, the UI can rob as much as 7 fps, maybe even more. I had 53 FPS at WCUSA with a bus, then hid the UI and got 60 FPS, but I had vsync on, so the gain might have been even bigger.

    So hide UI for extra FPS!
     
  13. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    Don't run with vsync; it causes stutters. I always run with the HUD disabled.
     
  14. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    Oh no, running without vsync causes stutters in Beam for me. With vsync it is silky smooth, as long as you keep that 60 fps and don't go under it.

    Stuttering with vsync occurs only if you ask too much from the hardware, so that it can only manage 59 fps or so, because then vsync drops from 60 to 30. When you set graphics so that your GPU never runs at 100% and the CPU is not maxed out, there is no stuttering with vsync.
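
    The 60-to-30 drop can be sketched as follows, assuming plain double-buffered vsync on a fixed 60 Hz display (no triple buffering or adaptive sync): a frame that misses one refresh has to wait for the next, so the displayed rate snaps to 60 divided by a whole number.

```python
import math


def vsync_fps(render_fps: float, refresh_hz: float = 60.0) -> float:
    """Displayed frame rate under strict double-buffered vsync.

    Each frame occupies a whole number of refresh intervals, so a renderer
    that is only slightly too slow for one interval ends up using two.
    """
    intervals_per_frame = math.ceil(refresh_hz / render_fps)
    return refresh_hz / intervals_per_frame


for fps in (120, 60, 59, 31, 29):
    print(f"renders at {fps} fps -> displays at {vsync_fps(fps):.0f} fps")
```

    In practice a game hovering around 59 to 61 fps would flip between the two rates, which is what shows up as stutter.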
     
  15. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    Having used both FreeSync and G-Sync, adaptive refresh rate technology is amazing. Too bad G-Sync monitors are super expensive.
     
  16. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    I did some tweaking, and my Cinebench score is up to 174. UserBenchmark shows the CPU getting practically the same SC results as a 6700K and claims this CPU of mine is in the 100th percentile. The CPU-Z test puts the single-core result right at 4790K level.

    Now I'm getting around 50 fps with the i7-6700 and GTX 1080, but I have no ReShade; I'm not sure if that would use CPU cycles, though. Next week I should probably get the 8086K, and it will be interesting to see if that improves things a bit more. For a 6700 non-K, I guess this is as good as it will get.
    upload_2018-8-31_5-42-34.png



    But try this on the above scene: press Shift-C for free camera, make sure the camera angle is not changing, and observe what happens with FPS :) :) :)

    I get ~7 fps more with that; it can be up to 10 fps from lowest to max.
    It's kind of difficult to figure out a logical explanation for that. 7 fps, from 50 to 57, is quite a few percent, yet nothing else changes; the camera is just not hooked to the vehicle anymore. It works with the orbit camera only, I think; there is no big change with other camera modes.
    upload_2018-8-31_5-50-49.png
    It is really happening, at least on some maps, but it is most noticeable when badly CPU limited.
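
    To put the 50 to 57 fps change into frame-time terms, here is a small worked calculation using only the two figures quoted above:

```python
def frame_ms(fps: float) -> float:
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


fps_orbit_cam = 50.0  # camera hooked to the vehicle
fps_free_cam = 57.0   # after pressing Shift-C

gain_pct = 100.0 * (fps_free_cam - fps_orbit_cam) / fps_orbit_cam
saved_ms = frame_ms(fps_orbit_cam) - frame_ms(fps_free_cam)

print(f"fps gain: {gain_pct:.0f}%, time saved per frame: {saved_ms:.2f} ms")
```

    That is a 14% gain, or roughly 2.5 ms less work per frame on whatever is limiting the frame rate.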
     
  17. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    I don't seem to get any difference in fps regardless of camera angle or whether it's attached or not.
     
  18. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    Not any at all?

    My FPS changes a lot when changing camera angle or looking in different directions at that place.

    Now if yours does not, and considering your FPS is so much lower than what I'm getting in the same conditions, it could indicate that some kind of FPS limiter is acting up?

    Sure, your single-core performance is a bit less, but the difference is not that huge, under 10%, while the fps difference is so big. Of course, it can be that a little more CPU results in a bigger percentage change in FPS, as the GPU can do more work, so the relationship between CPU single-core performance and FPS might not be linear. However, I would still expect you to see different FPS with different camera angles. It is not huge for me, only about +/- 5 fps.

    Shift-C does change fps more. Maybe it is an Intel/Nvidia thing. If the game uses fewer cores in freecam mode, then turbo clock could have something to do with it; I actually need to check that. But here is a video showing how my fps changes between freecam and orbit cam while the camera angle etc. is not changing:


    If you press Ctrl-F twice, you can see drawcalls and, at the bottom, shadow drawcalls. When drawcalls start to go over 3000, the CPU usually starts to have trouble. How many polygons a map manages to put out with those 3000 drawcalls then varies greatly between maps.

    Still, there is something more to it, as some maps manage just fine with a higher number of drawcalls. But generally, keeping them low allows more use of the available GPU power and the threaded CPU power, of which there is usually plenty.

    As an example, I crafted a little test: over 42,000,000 polygons with only 3245 drawcalls:
    upload_2018-8-31_15-42-27.png upload_2018-8-31_15-42-42.png

    As I was adding more meshes to the scene, I paid attention to CPU load, and after the beginning it never became a limiting factor (even JRI is CPU limited for me, especially with 3 cars behind those buildings). But if I had 100 textures for each object, the CPU would be on its knees very quickly, as each texture adds drawcalls.
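
    The numbers from that test can be worked through like this. The polygon and drawcall counts and the ~3000 drawcall threshold come from this post; the estimate function is a deliberately crude assumption (one drawcall per object per texture, no batching) and the object count of 1000 is hypothetical.

```python
polygons = 42_000_000
drawcalls = 3245
polygons_per_drawcall = polygons / drawcalls

DRAWCALL_BUDGET = 3000  # rough point where the CPU starts to struggle


def estimated_drawcalls(objects: int, textures_per_object: int) -> int:
    """Crude estimate: one drawcall per object per texture (assumption)."""
    return objects * textures_per_object


one_texture = estimated_drawcalls(1000, 1)
hundred_textures = estimated_drawcalls(1000, 100)

print(f"{polygons_per_drawcall:,.0f} polygons per drawcall in the test scene")
print(f"1000 objects x 1 texture:    {one_texture:,} drawcalls")
print(f"1000 objects x 100 textures: {hundred_textures:,} drawcalls, "
      f"{hundred_textures / DRAWCALL_BUDGET:.0f}x the budget")
```

    Under these assumptions the test scene averages close to 13,000 polygons per drawcall, which is why a very high polygon count can still stay within what one CPU thread can submit.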

    Still, I'm not sure that can explain it all, as it is a combination of different things. But generally, the fewest materials and textures will give the best chance for Threadripper to do its thing. Combined with a really powerful GPU, it is possible to make it run a huge grid of cars on a map, but the map needs some tricks to get the best out of it.

    Jungle Rock Island is really good in this respect.

    However, vehicles also tend to cause load on the same thread; I guess it has something to do with materials and layers or something of that nature. Sticking to the same vehicle model might allow more vehicles to be run. I'm not really sure how well the engine handles that, but game engines can usually bundle together objects that are clones of each other, and at least on the physics side BeamNG does this too.

    With DirectX 12/Vulkan, I would guess Threadripper gains a lot of performance, so future games will probably use Threadripper much more efficiently. Just Cause 3 uses all 8 of my threads evenly. I'm not sure if it uses all of Threadripper's threads, and it is only about 50% load on 8 threads, but going for 144 fps it would probably benefit from more threads, if it can use more than 8.
    They have balanced that game quite nicely, even though DirectX imposes this drawcall restriction; I guess it is possible to get around it with clever methods when building the graphics.

    Also, T3D adds to the issue with its expensive shadows, which again limits drawcalls etc. BeamNG currently has a heavy UI, which does not help, so that is why the experience is different with BeamNG, I believe.

    I set shadows to partial and dynamic reflections off, and 95% of maps run at a solid 60 fps. On most I can keep dynamic reflections on too, and on some maps I can set them to maximum. It really depends on how the map is made: a zillion pieces of stuff is not a problem if they are copies of each other, but if each bit is individual and has its own texture maps, or even several per object, that seems to correlate with this lower-than-expected performance issue.
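
    The point about copies being cheap can be sketched by counting distinct (mesh, material) pairs. This is a simplified illustration, assuming identical pairs can be submitted in a single instanced drawcall; it is not a description of how Torque 3D or BeamNG actually batches, and the mesh and material names are hypothetical.

```python
def drawcalls_without_batching(objects):
    """One drawcall for every placed object."""
    return len(objects)


def drawcalls_with_instancing(objects):
    """One drawcall per distinct (mesh, material) pair (assumption)."""
    return len(set(objects))


# 10,000 copies of the same tree vs. 10,000 one-off props.
forest_of_clones = [("pine_tree", "bark_and_needles")] * 10_000
unique_props = [(f"prop_{i}", f"material_{i}") for i in range(10_000)]

clone_calls = drawcalls_with_instancing(forest_of_clones)
unique_calls = drawcalls_with_instancing(unique_props)

print(f"clones:  {drawcalls_without_batching(forest_of_clones):,} objects "
      f"-> {clone_calls:,} drawcall(s)")
print(f"uniques: {drawcalls_without_batching(unique_props):,} objects "
      f"-> {unique_calls:,} drawcall(s)")
```

    The same object count lands at opposite ends of the drawcall range depending on how much of the scene is shared.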
     
  19. gameboy3800

    Joined:
    Apr 27, 2014
    Messages:
    787
    I have vsync, smoothing, and any other frame limiters completely disabled. Absolutely no change. I've locked all 16 cores of my processor to 4 GHz for more overall performance. Without that, it could turbo up to 4.15 or 4.2 GHz, but on up to 4 cores only. That's OK in some games, but workstation tasks like rendering would be crippled by its slower all-core turbo speed.

    My Threadripper is a workstation processor first, gaming second. I've been aware of this since it was announced, and I don't expect it to beat any Intel chip in raw gaming performance. But to see it this underutilized in a game that gets more physics intensive the more you show on screen just confuses me.

    Screenshot 1: Jungle Rock Island, 100% max settings, 1080p with full reflections. The graphics card is fully utilized. ~52 fps. ReShade is still on, boosting colors. ReShade is a post-process thing; it only leads to some more VRAM usage and is not power intensive. Paused vs. unpaused, there's only 1 fps gained. Without reflections I double my fps and hit 102-103 fps depending on whether paused or unpaused. The CPU only tops out at around 8% in this scene. No fps change whether the camera is connected or disconnected.

    Screenshot 2: same place, but with the poly count upped by having 6 full cars in the shot. Reflections once again at full. Fps is halved here, at only 25 versus the 50ish from before. Graphics card utilization dropped to 75%. Disabling reflections once again brings a huge uplift of about 2x, getting me to the high 40s. Once again, camera position (as long as I look at the cars) or pausing the game has no effect on fps. Looking away brings frames up to 90. The CPU now shows 15%.

    Notice that in both pictures I do seem to have high drawcall counts. Could this be the Threadripper/Ryzen Achilles' heel?
     

    Attached Files:

    • BeamNG2018-09-0100-31-39-61.jpg
    • BeamNG2018-09-0100-44-58-30.jpg
  20. fufsgfen

    Joined:
    Jan 10, 2017
    Messages:
    6,782
    Yeah, drawcalls are a problem; they are a problem even for a fast Intel chip.

    Each car's graphics have something that uses only 1 of your cores, for all cars, and dynamic reflections then multiply this effect. Dynamic reflections can take 30% of that single core's computing power, which is already taken up by car graphics, shadows and UI. There is only so much a single core can do.

    Drawcalls seem to be a good indication of what is within the limits of a single CPU core. It is not perfect, as other things load that same CPU core, but it does indicate whether one is heading into the lag zone.

    While the physics of this game would probably let me run 10 cars with no problem, this CPU graphics part limits the car count to 2 on WCUSA and 4-5 on Jungle Rock Island if I want to keep 60 fps. The GPU would handle more, but that one CPU core cannot.

    It is probably partly because DX11 is not very threaded in some areas, but also because in some parts the modified Torque 3D game engine that BeamNG uses puts extra load on that same CPU thread.

    So I ordered an 8086K to be able to run a few more cars, but even with that I expect to see the single thread limiting performance more than all the cores or the GPU would.

    I would guess the situation will not remain like this forever, though. Considering that most CPUs are at the same level as ours in single-thread performance (we have only about a 5% difference or less), and that this currently limits the number of vehicles more than multicore performance does, the devs probably have great interest in improving this.

    New UI might be something that helps, at least a little.

    Meanwhile, setting shadows to partial and keeping dynamic reflections distance and faces per update low, or dynamic reflections off, are pretty much the only things we can do while maintaining most of the visual quality.

    In other games, and especially in coming years, Threadripper will shine, because software in general is moving in a more threaded direction. I hear DX12/Vulkan is much better with drawcalls, so that a single thread is less limiting, which should allow more of the cores to work.
     