significant performance degradation since blender 2.02

General discussion about the development of the open source Blender

Moderators: jesterKing, stiv

Post Reply
dbowie
Posts: 0
Joined: Wed Jul 30, 2003 4:47 am

significant performance degradation since blender 2.02

Post by dbowie » Wed Jul 30, 2003 4:59 am

Hello all,

I recently tried out the "blenchmark" performance test on 2.28 and was worried by the numbers it reported. Sure enough, after trying various archived releases, I had to go back to 2.02 to get blenchmark numbers like I remembered. Here are the numbers I get:

Using: blender -p 100 100 900 700 blacksmith.blend

Version | draw | draw+z
2.28    |   64 |     95
2.02    |    7 |      9


My system: 800 MHz Duron, 256 MB RAM, GeForce 3 Ti 200 (64 MB), newest NVidia drivers.

I observed similar numbers in both Linux and Windows.

What gives? Anyone else experiencing this?

- Mark

theeth
Posts: 500
Joined: Wed Oct 16, 2002 5:47 am
Location: Montreal
Contact:

Post by theeth » Wed Jul 30, 2003 5:13 am

three words:

font anti aliasing

that's why

Martin
Life is what happens to you when you're busy making other plans.
- John Lennon

dbowie
Posts: 0
Joined: Wed Jul 30, 2003 4:47 am

Post by dbowie » Wed Jul 30, 2003 5:31 am

Martin,

Just double checked... it's not the anti-aliased fonts causing the problem. Using 2.28, I still get a number in the 60s for the "draw" test whether international font support is enabled (and the fonts are nice and smooth) or disabled (and the fonts are nice and jaggy :)).

It's something more than that. Any ideas? Can someone confirm or deny what I'm observing by trying this him/herself?

xype
Posts: 127
Joined: Tue Oct 15, 2002 10:36 pm

Post by xype » Wed Jul 30, 2003 11:18 pm

dbowie wrote: Just double checked... it's not the anti-aliased fonts causing the problem.

Did you disable mipmaps and enable vertex arrays in the system & OpenGL panel, too?

thorax
Posts: 320
Joined: Sun Oct 27, 2002 6:45 am
Contact:

Post by thorax » Thu Jul 31, 2003 12:42 am

Is there any way to trace runtime throughout the code without the blenchmark code, or is that sufficient? If we could put some atomic code in Blender, say one statement for every function that just diffs a counter (a justifiable global variable, or possibly a #define'd inline function), we could use that to judge runtime per function and determine which libraries and functions are less optimal.

I mean something like the following, as inline code or something that can easily be added into the source:

Code:

/* imagine this for every function */
void myfunction(void)
{
    static long BENCHIT_deltaclock = 0, BENCHIT_times = 0, BENCHIT_totaltime = 0;
    BENCHIT_times++;
    BENCHIT_deltaclock = REG_CPUCLOCK;
    /* REG_CPUCLOCK assumed to be either a function call or an ever-changing register in memory */

    /* code for myfunction here */

    /* add up deltas; note I'm storing values in local statics, so no need for globals */
    BENCHIT_totaltime += REG_CPUCLOCK - BENCHIT_deltaclock;

    /*
     * Periodically a node in Blender that collects benchmarks sets BENCHIT_COLLECTTIMES,
     * which causes all of this code to register its values with the database according
     * to its location. It would be best to get the function name, source file name, and
     * line number registered at compile time so the code can easily be substituted into
     * a preprocessor directive and added everywhere. The variable names chosen should be
     * the least likely to occur in the source code, so there are no variable conflicts.
     * It's possible the compiler could auto-generate the code for every function, making
     * the job easier.
     *
     * Entries in the benchmark database are made unique by the combination of source
     * file name, line number, and function name, so they are filterable and
     * non-overlapping in the database. The COLLECTTIMES flag should only be set every
     * 10,000 clock cycles or something rather large, so it doesn't degrade performance.
     * The benchmarking code would also add about 10 extra instructions to every
     * function, but the code is constant-size and can easily be worked out of the
     * benchmark computation. It should be atomic so it can't be interrupted mid-update
     * (probably not possible).
     */
    if (BENCHIT_COLLECTTIMES)
        BENCHITDATABASE(__THISFUNCTION, __SOURCEFILENAME, __LINE,
                        BENCHIT_totaltime, BENCHIT_deltaclock, BENCHIT_times);
}

The reason I designed the code like this is so it could be reduced to something like:

Code:

int myfunction(void)
{
    BENCHIT_START();

    /* the code for myfunction here */

    BENCHIT_STOP(); /* this is not a function call; it's substituted as inline source */
}
Later on it might be possible to write the code such that it traces the runtime of subcalls and recursion by registering "CALLED_FROM" data in the database. Again, this depends on the directives available in the preprocessor or the compiler; some things are just easier for the compiler or parser to do, like systematically adding code to the sources at function/module boundaries.


This is like source code I used on a Palm Pilot to trace functions independently of a debugger and analyze handshaking events, because I was using IR communication for events that occurred too quickly for a debugger to connect. It worked better than using a debugger, but it depends on the code executing in sequential order; I can't imagine how this code could be written to work in a threaded environment.

This is merely pseudocode, but it's a start on the concept of being able to track the runtimes of all the code in Blender without the use of globals.

Someone probably has a better solution, probably using a compiler directive or a debugger, but the idea here is to allow any user anywhere to collect runtime information without anything special, and to allow adding the code without hurting maintainability. The purpose of the call to the database is to let all the functions report values back without using shared-memory techniques, globals, or compiling C source into object files and using a function to reference the variables in the module (the C way of implementing objects; note that C++ can instantiate these objects while C can't, which is what I mean by nailing objects to the floor).

Also, if the compiler can do this without requiring static insertion of the code into the functions, that's better than copy/pasting the code across the source distribution, which could lead to bugs. I guess there are C preprocessors that are smarter about parsing than the old C precompiler/preprocessor was.

Anyhow..

dbowie
Posts: 0
Joined: Wed Jul 30, 2003 4:47 am

Post by dbowie » Thu Jul 31, 2003 2:29 am

xype wrote: Did you disable mipmaps and enable vertex arrays in the system & OpenGL panel, too?
Just tried it. I get the following numbers (only a slight improvement):

draw: 58
draw+z: 96

xype
Posts: 127
Joined: Tue Oct 15, 2002 10:36 pm

Post by xype » Thu Jul 31, 2003 7:41 am

dbowie wrote: Just tried it. I get the following numbers (only a slight improvement):

draw: 58
draw+z: 96

Hm, the only other thing I can suggest is to turn your display down to 640x480 and 16 bit. But I doubt you'd want to work with that :wink:

Post Reply