threadripper bottleneck???

gameboy3800 · Sep 1, 2018

it is so cheap now to get a 6+ core processor from either intel or amd where before they were a $1000+ luxury enthusiast piece. it's only a matter of time before true high core count parts will outshine the higher clockspeed mentality we're in now.

fufsgfen · Sep 1, 2018

gameboy3800 said: ↑

it is so cheap now to get a 6+ core processor from either intel or amd where before they were a $1000+ luxury enthusiast piece. it's only a matter of time before true high core count parts will outshine the higher clockspeed mentality we're in now.

Yeah, and there is not much left to gain in single-core speed with current technology. Sure, they are pushing to 10 nm and below, which gives some gains, but I would not hold my breath for any significant improvement within 5 years. Cores can be added, though, so that will just push software to evolve to use more of them.

Any software that still relies on single-core speed 5 years from now will probably be considered a relic of the past. The move to multicore has been very strong in recent years, in large part because of what AMD has been doing with affordable cores. It is also cheaper to produce more cores than one really, really high-end core, so I can't see anyone getting away with single-core limitations in the future; that would draw quite negative remarks in reviews etc.

BeamNG is great in that it can use all cores for physics, but bad in that graphics are mostly limited to a single core. So even though 0.13.0.5 uses 20% less of each core for physics compared to 0.8, I can't really run more cars, because graphics are still limited to that single core. The number of cars I can run is the same as before.

I'm not sure if DX12 is the solution; it might be that Torque 3D is more to blame for using only 1 core for some graphics work. But I'm quite confident the situation will improve in the future, as it is really the only way to get more cars or more complex maps working well with good graphics quality.

I guess the highest 1-car result in Banana bench is something like 56 Mbeams/s.

I get a bit over 40 Mbeams/s, so there is something like 40% more available from a faster CPU. If graphics etc. used 4 threads instead of one, that would ideally be 4 times what I have now, the equivalent of having 160 Mbeams/s on a single core. So yeah, multicore is the only way to go for software; however, it is not an easy route.
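To put those numbers side by side, here is a quick sketch of the arithmetic. The 56 and 40 Mbeams/s figures are the rough ones from this post; the function names are made up for illustration, and the 4-thread case assumes perfect scaling, which real software would not reach.

```python
def headroom_percent(best: float, mine: float) -> float:
    """How much faster the best result is than mine, in percent."""
    return (best / mine - 1.0) * 100.0


def single_core_equivalent(mine: float, threads: int) -> float:
    """Single-core speed needed to match `threads` perfectly scaling threads.

    Assumption: the work splits evenly with no overhead (idealised).
    """
    return mine * threads


best_known = 56.0  # Mbeams/s, rough top 1-car Banana bench result
my_result = 40.0   # Mbeams/s, rounded down from "a bit over 40"

print(f"headroom: {headroom_percent(best_known, my_result):.0f}%")
print(f"4 threads ~ {single_core_equivalent(my_result, 4):.0f} Mbeams/s on one core")
```

So even the best single-core result known to me is far short of what spreading the work over a few threads could give in the ideal case.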

About draw calls, this might be interesting: https://stackoverflow.com/questions/4853856/why-are-draw-calls-expensive
Code:
First of all, I'm assuming that with "draw calls", you mean the command that tells the GPU to render a certain set of vertices as triangles with a certain state (shaders, blend state and so on).

Draw calls aren't necessarily expensive. In older versions of Direct3D, many calls required a context switch, which was expensive, but this isn't true in newer versions.

The main reason to make fewer draw calls is that graphics hardware can transform and render triangles much faster than you can submit them. If you submit few triangles with each call, you will be completely bound by the CPU and the GPU will be mostly idle. The CPU won't be able to feed the GPU fast enough.

Making a single draw call with two triangles is cheap, but if you submit too little data with each call, you won't have enough CPU time to submit as much geometry to the GPU as you could have.

There are some real costs with making draw calls, it requires setting up a bunch of state (which set of vertices to use, what shader to use and so on), and state changes have a cost both on the hardware side (updating a bunch of registers) and on the driver side (validating and translating your calls that set state).

But the main cost of draw calls only apply if each call submits too little data, since this will cause you to be CPU-bound, and stop you from utilizing the hardware fully.

Just like Josh said, draw calls can also cause the command buffer to be flushed, but in my experience that usually happens when you call SwapBuffers, not when submitting geometry. Video drivers generally try to buffer as much as they can get away with (several frames sometimes!) to squeeze out as much parallelism from the GPU as possible.

While Beam does pack identical objects into a batch, which helps with the issue, that only works with identical objects. That is why I can put so much stuff on screen: I'm using a magician's trick of illusion where every object is identical, and the engine can handle loads of those without choking the single CPU thread.
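A toy model of why identical objects are so cheap on the CPU side. This is only a sketch under my own assumptions: it treats every unique mesh + material pair as one draw call when batching is on, which is not necessarily how Torque 3D or BeamNG actually groups things, and all names and numbers here are made up.

```python
from typing import NamedTuple, Sequence


class SceneObject(NamedTuple):
    mesh: str       # hypothetical mesh name
    material: str   # hypothetical material/texture name
    triangles: int


def count_draw_calls(objects: Sequence[SceneObject], batch_identical: bool) -> int:
    """One call per object, or one call per unique (mesh, material) pair."""
    if not batch_identical:
        return len(objects)
    return len({(obj.mesh, obj.material) for obj in objects})


# 1000 copies of the same rock vs. 1000 different small props
identical = [SceneObject("rock", "rock_mat", 200)] * 1000
varied = [SceneObject(f"prop{i}", f"mat{i}", 200) for i in range(1000)]

for name, scene in (("identical", identical), ("varied", varied)):
    unbatched = count_draw_calls(scene, batch_identical=False)
    batched = count_draw_calls(scene, batch_identical=True)
    print(f"{name}: {unbatched} calls unbatched, {batched} batched")
```

Both scenes have the same triangle count, so the GPU load is similar, but the varied scene gets no help from batching and the CPU still has to submit every object separately.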

On Roane County, you get a very small number of triangles considering how many different kinds of stuff there are.

Two things get expensive there: each texture is its own file, and objects are small with a low number of triangles, so the CPU gets heavily loaded and can't feed the GPU.

WCUSA has large objects, which limits the use of LODs, but since GPU power is available it can be used to help the CPU; combining objects and textures helps with that. Another way is to use identical objects as much as possible; clever placement can help to hide the repetition etc.

Reflections then multiply everything. With faces per update at maximum, the scene is rendered 6 more times per frame than without reflections, and each of those 6 passes makes all those expensive calls all over again.
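Rough sketch of how that multiplies out. It assumes each reflection face is a full extra pass over the same scene with the same number of draw calls as the main view; the real engine may well render reflections at lower detail, and the 2000 calls per view is a made-up number.

```python
def draw_calls_per_frame(scene_calls: int, faces_per_update: int) -> int:
    """Main view plus one extra scene pass per reflection face (assumed)."""
    return scene_calls * (1 + faces_per_update)


scene_calls = 2000  # made-up number of draw calls for one view of the map

for faces in (0, 1, 6):
    total = draw_calls_per_frame(scene_calls, faces)
    print(f"{faces} faces per update -> {total} draw calls per frame")
```

So lowering faces per update cuts CPU-side work directly, which matters more on maps that are already draw call heavy.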

Understanding this helps to see a bit how to make Beam run best.
