Speeding up the maths!

Category: visual studio vc

Question

PaulGGriffiths on Mon, 09 Jan 2017 13:38:47


Check this out! Twice as fast multiplications...

// Copyright 2017 Paul Griffiths.
// Public domain, this notice must be included!

#include "stdafx.h"
#include <iostream>
#include <cstdio>   // printf
#include <time.h>
using namespace std;

template <class T>
class VerifyType {
	T a, b, res;

public:
	VerifyType() {
		a = 0;      // initialise the operands so the first comparisons read defined values
		b = 0;
		ch = false;
		res = 0;
	}

	void operator= (int val);
	T operator* (T val);

	operator T() { return a; }

	bool ch, bC;
};

template <class T>
void VerifyType<T>::operator= (int val)
{
	if (val != a)
	{
		a = val;
		ch = true;
	}
}

template <class T>
T  VerifyType<T>::operator* (T val)
{
	if (ch || b != val)
	{
		b = val;
		ch = false;
		return (res = a * val);
	}
	else
	{
		return res;
	}
}

int main() 
{
	int res;
	clock_t t1;
	clock_t t2;

	{
		int a;
		int b;
		int c;

		t1 = clock();

		a = 500;
		b = 100;
		for (int i = 0; i < 10000; i++)
		{
			res = a * b;
			cout << res << "\n";
		}

		c = 1000;
		for (int i = 0; i < 10000; i++)
		{

			res = a * c;
			cout << res << "\n";
		}

		b = 5000;
		for (int i = 0; i < 10000; i++)
		{
			res = a * b;
			cout << res << "\n";
		}

		t1 = clock() - t1;
	}

	{
		VerifyType<int> a;
		VerifyType<int> b;
		VerifyType<int> c;

		t2 = clock();

		a = 500;
		b = 100;
		for (int i = 0; i < 10000; i++)
		{
			res = a * b;
			cout << res << "\n";
		}

		c = 1000;
		for (int i = 0; i < 10000; i++)
		{

			res = a * c;
			cout << res << "\n";
		}

		b = 5000;
		for (int i = 0; i < 10000; i++)
		{
			res = a * b;
			cout << res << "\n";
		}

		t2 = clock() - t2;
	}

	// clock_t is not guaranteed to be int, so cast it to match the format specifier
	printf("int's took  %ld clicks (%f seconds).\n", (long)t1, ((float)t1) / CLOCKS_PER_SEC);
	printf("VerifyType<int>'s took  %ld clicks (%f seconds).\n", (long)t2, ((float)t2) / CLOCKS_PER_SEC);

	return 0;
}

What do you think?

Still need to add addition, subtraction, division etc.

Replies

Igor Tandetnik on Mon, 09 Jan 2017 14:19:41


On 1/9/2017 8:38 AM, VerifyINT128 wrote:

Check this out! Twice as fast multiplications...

It's only faster when you multiply the same number by the same number over and over again, as your tests do. VerifyType simply caches the result of the most recent multiplication, and reuses it if the same pair of numbers needs to be multiplied again. Not very useful in everyday practical situations.

I predict this approach will actually hurt performance for addition and subtraction, even under your very artificial testing conditions. All those checks you need to perform would be more expensive than just simply performing the addition unconditionally.

PaulGGriffiths on Mon, 09 Jan 2017 14:47:18


Yes, but if you're performing the same calculations more often than different ones, then it may be worth it?

Depends what you are doing.

It's the same for addition, subtraction, multiplication and division: about twice as fast.

Was only a little project of mine. Had to check it out.

If it was part of the compiler then it may be OK?




PaulGGriffiths on Mon, 09 Jan 2017 15:00:13


edit: removed

Igor Tandetnik on Mon, 09 Jan 2017 15:22:54


On 1/9/2017 9:47 AM, VerifyINT128 wrote:

Yes, but if you're performing the same calculations more often than different ones, then it may be worth it?

But why would you want to make the same calculation multiple times? E.g. even if you wanted to print 10,000 copies of the same number, why would you write

 for (int i = 0; i < 10000; i++)
 {
     res = a * b;
     cout << res << "\n";
 }

and not

 res = a * b;
 for (int i = 0; i < 10000; i++)
 {
     cout << res << "\n";
 }

(note how, in the second sample, the multiplication is done once, outside of the loop).

With all due respect, yours is a solution in search of a problem.

Igor Tandetnik on Mon, 09 Jan 2017 15:34:33


On 1/9/2017 10:00 AM, VerifyINT128 wrote:

template <class T>
T  VerifyType<T>::operator+ (T val)
{
  if (c || add != val)
  {
    c = false;
    return (add = a + val);
  }
  else
  {
    return add;
  }
}

That won't work. If a==1 (or anything other than zero), and the caller adds 2, then add is set to 3. If the caller adds 2 again, add != val and you perform the addition again - you don't use the cache. What's worse, if the caller now adds 3, then you do use the cache and return 3, not 4.
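
A minimal sketch of what a fix could look like, assuming the goal is a one-entry cache for addition: the incoming operand is compared with the previously used operand, not with the cached sum. The names `AddCache`, `lastOperand`, `lastSum`, `valid` and the `additions` counter are hypothetical and do not appear in the posted code.

```cpp
// One-entry cache for "value + operand" (illustrative names, not from the thread).
template <class T>
class AddCache {
    T value{};          // current left-hand value
    T lastOperand{};    // right-hand operand of the cached addition
    T lastSum{};        // cached result of value + lastOperand
    bool valid = false; // true while lastSum matches value and lastOperand

public:
    int additions = 0;  // counts real additions, for illustration only

    // Changing the stored value invalidates the cached sum.
    void set(T v) {
        if (v != value) {
            value = v;
            valid = false;
        }
    }

    // Recompute only when the cache is stale or the operand differs.
    T add(T v) {
        if (!valid || lastOperand != v) {
            lastOperand = v;
            lastSum = value + v;
            valid = true;
            ++additions;
        }
        return lastSum;
    }
};
```

With the value set to 1, calling `add(2)` twice performs one real addition, and a following `add(3)` recomputes and returns 4. The comparison and flag handling still run on every call, which is the overhead discussed above.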

PaulGGriffiths on Mon, 09 Jan 2017 16:01:53


I've updated the code; it had mistakes because I wrote it quickly. It's OK now.

If you go:

a = b + c;

e = c + d;

f = d + e;

Then it would be quicker if the values were the same. You just have to write it in such a way that you don't change it like:

a = b + c;

c = d + e; // c changed.

You have to use extra variables and only one calculation per line.

When the variables change often, use standard types; when they don't change often, use VerifyType types.

I only did it for fun to see if it was quicker, twice the speed is pretty good.

Uses A LOT of memory though.

I would try assembly if only I knew it.

I haven't included the comparators ==, !=, <= and >= because I don't believe they would be any quicker.











Igor Tandetnik on Mon, 09 Jan 2017 16:22:59


On 1/9/2017 11:01 AM, VerifyINT128 wrote:

but if you go:

a = b + c;

e = c + d;

f = d + e;

Then it would be quicker if the variables were the same.

I don't quite see where in this sequence VerifyType would help. Which addition could be eliminated thanks to the caching? As far as I can tell, VerifyType would do more work, not less: on top of the three additions (which it would have to do anyway), it also performs various comparisons, assignments to flags and such.

PaulGGriffiths on Mon, 09 Jan 2017 16:35:13


Try this, the results are about 3 times as fast:

int's took  12327 clicks (12.327000 seconds).
MType<int>'s took  4468 clicks (4.468000 seconds).

// Memoization Type
// Copyright 2017 Paul Griffiths.
// Public domain, this notice must be included!

#include "stdafx.h"
#include <iostream>
#include <time.h>
using namespace std;

template <class T>
class MType {
    T
        x,
        a, s, m, d, o, l, r,
        xa, xs, xm, xd, xo, xl, xr,
        ra, rs, rm, rd, ro, rl, rr;

public:
    MType()
        :
        x(0),
        c(false),
        a(0), s(0), m(0), d(0), o(0), l(0), r(0),
        xa(0), xs(0), xm(0), xd(0), xo(0), xl(0), xr(0),
        ra(0), rs(0), rm(0), rd(0), ro(0), rl(0), rr(0)
    {

    }

    void operator= (T v);
    void operator+= (T v);
    void operator-= (T v);
    void operator++ (T v);
    void operator-- (T v);
    void operator*= (T v);
    void operator/= (T v);
    void operator%= (T v);
    void operator<<= (T v);
    void operator>>= (T v);

    T operator+ (T v);
    T operator- (T v);
    T operator* (T v);
    T operator/ (T v);
    T operator% (T v);
    T operator>> (T v);
    T operator<< (T v);

    operator T() { return x; }

    bool c;
};

template <class T>
void MType<T>::operator= (T v)
{
    if (v != x)
    {
        x = v;
        c = true;
    }
}

template <class T>
void  MType<T>::operator+= (T v)
{
    x += v;
    c = true;
}

template <class T>
void  MType<T>::operator-= (T v)
{
    x -= v;
    c = true;
}

template <class T>
void  MType<T>::operator++ (T v)
{
    x++;
    c = true;
}

template <class T>
void  MType<T>::operator-- (T v)
{
    x--;
    c = true;
}

template <class T>
void  MType<T>::operator*= (T v)
{
    x *= v;
    c = true;
}

template <class T>
void  MType<T>::operator/= (T v)
{
    x /= v;
    c = true;
}

template <class T>
void  MType<T>::operator%= (T v)
{
    x %= v;
    c = true;
}

template <class T>
void  MType<T>::operator>>= (T v)
{
    x >>= v;
    c = true;
}

template <class T>
void  MType<T>::operator<<= (T v)
{
    x <<= v;
    c = true;
}

template <class T>
T  MType<T>::operator+ (T v)
{
    if (c || xa != v)
    {
        xa = v;
        c = false;
        return (ra = x + v);
    }
    else
    {
        return ra;
    }
}

template <class T>
T  MType<T>::operator- (T v)
{
    if (c || xs != v)
    {
        xs = v;
        c = false;
        return (rs = x - v);
    }
    else
    {
        return rs;
    }
}

template <class T>
T  MType<T>::operator* (T v)
{
    if (c || xm != v)
    {
        xm = v;
        c = false;
        return (rm = x * v);
    }
    else
    {
        return rm;
    }
}

template <class T>
T  MType<T>::operator/ (T v)
{
    if (c || xd != v)
    {
        xd = v;
        c = false;
        return (rd = x / v);
    }
    else
    {
        return rd;
    }
}

template <class T>
T  MType<T>::operator% (T v)
{
    if (c || xo != v)
    {
        xo = v;
        c = false;
        return (ro = x % v);   // modulo, not bitwise AND
    }
    else
    {
        return ro;
    }
}

template <class T>
T  MType<T>::operator<< (T v)
{
    if (c || xl != v)
    {
        xl = v;
        c = false;
        return (rl = x << v);
    }
    else
    {
        return rl;
    }
}

template <class T>
T  MType<T>::operator>> (T v)
{
    if (c || xr != v)
    {
        xr = v;
        c = false;
        return (rr = x >> v);
    }
    else
    {
        return rr;
    }
}

int main()
{

    int res;
    clock_t t1;
    clock_t t2;

    {
        int a;
        int b;
        int c;
        int d;
        int e;
        int f;

        t1 = clock();

        res = 0;
        a = 1000;
        b = 2000;
        c = 3000;
        d = 4000;
        e = 5000;
        f = 6000;
        for (int i = 0; i < 10000; i++)
        {
            a = b + c;
            e = c + d;
            f = d + e;
            res += a + e + f;
            cout << res << "\n";
        }
        

        t1 = clock() - t1;
    }

    {
        MType<int> a;
        MType<int> b;
        MType<int> c;
        MType<int> d;
        MType<int> e;
        MType<int> f;

        t2 = clock();

        res = 0;
        a = 1000;
        b = 2000;
        c = 3000;
        d = 4000;
        e = 5000;
        f = 6000;
        for (int i = 0; i < 10000; i++)
        {
            a = b + c;
            e = c + d;
            f = d + e;
            res += a + e + f;
            cout << res << "\n";
        }
        

        t2 = clock() - t2;
    }

    printf("int's took  %ld clicks (%f seconds).\n", (long)t1, ((float)t1) / CLOCKS_PER_SEC);
    printf("MType<int>'s took  %ld clicks (%f seconds).\n", (long)t2, ((float)t2) / CLOCKS_PER_SEC);

    return 0;
}






Pavel A on Mon, 09 Jan 2017 16:56:25


Memoization.

PaulGGriffiths on Mon, 09 Jan 2017 17:03:24


Thanks, I'll rename VerifyType to MType!

PaulGGriffiths on Mon, 09 Jan 2017 17:42:44


Here's MDType for double and float:

// Memoization MDType Template
// Copyright 2017 Paul Griffiths.
// Public domain, this notice must be included!

#pragma once
template <class T>
class MDType {
	T
		x,
		a, s, m, d, o, l, r,
		xa, xs, xm, xd, xo, xl, xr,
		ra, rs, rm, rd, ro, rl, rr;

public:
	MDType()
		:
		x(0),
		c(false),
		a(0), s(0), m(0), d(0), o(0), l(0), r(0),
		xa(0), xs(0), xm(0), xd(0), xo(0), xl(0), xr(0),
		ra(0), rs(0), rm(0), rd(0), ro(0), rl(0), rr(0)
	{

	}

	void operator= (T v);
	void operator+= (T v);
	void operator-= (T v);
	void operator*= (T v);
	void operator/= (T v);
	void operator%= (T v);
	void operator<<= (T v);
	void operator>>= (T v);

	T operator+ (T v);
	T operator- (T v);
	T operator* (T v);
	T operator/ (T v);
	T operator% (T v);
	T operator>> (T v);
	T operator<< (T v);

	operator T() { return x; }

	bool c;
};

template <class T>
void MDType<T>::operator= (T v)
{
	if (v != x)
	{
		x = v;
		c = true;
	}
}

template <class T>
void  MDType<T>::operator+= (T v)
{
	x += v;
	c = true;
}

template <class T>
void  MDType<T>::operator-= (T v)
{
	x -= v;
	c = true;
}

template <class T>
void  MDType<T>::operator*= (T v)
{
	x *= v;
	c = true;
}

template <class T>
void  MDType<T>::operator/= (T v)
{
	x /= v;
	c = true;
}

template <class T>
void  MDType<T>::operator%= (T v)
{
	x %= v;
	c = true;
}

template <class T>
void  MDType<T>::operator>>= (T v)
{
	x >>= v;
	c = true;
}

template <class T>
void  MDType<T>::operator<<= (T v)
{
	x <<= v;
	c = true;
}

template <class T>
T  MDType<T>::operator+ (T v)
{
	if (c || xa != v)
	{
		xa = v;
		c = false;
		return (ra = x + v);
	}
	else
	{
		return ra;
	}
}

template <class T>
T  MDType<T>::operator- (T v)
{
	if (c || xs != v)
	{
		xs = v;
		c = false;
		return (rs = x - v);
	}
	else
	{
		return rs;
	}
}

template <class T>
T  MDType<T>::operator* (T v)
{
	if (c || xm != v)
	{
		xm = v;
		c = false;
		return (rm = x * v);
	}
	else
	{
		return rm;
	}
}

template <class T>
T  MDType<T>::operator/ (T v)
{
	if (c || xd != v)
	{
		xd = v;
		c = false;
		return (rd = x / v);
	}
	else
	{
		return rd;
	}
}

template <class T>
T  MDType<T>::operator% (T v)
{
	if (c || xo != v)
	{
		xo = v;
		c = false;
		return (ro = x % v);   // modulo, not bitwise AND (compiles only for integral T)
	}
	else
	{
		return ro;
	}
}

template <class T>
T  MDType<T>::operator<< (T v)
{
	if (c || xl != v)
	{
		xl = v;
		c = false;
		return (rl = x << v);
	}
	else
	{
		return rl;
	}
}

template <class T>
T  MDType<T>::operator>> (T v)
{
	if (c || xr != v)
	{
		xr = v;
		c = false;
		return (rr = x >> v);
	}
	else
	{
		return rr;
	}
}

PaulGGriffiths on Mon, 09 Jan 2017 17:50:02


To be honest I think it's pretty clever, and I'll probably use it in my latest DirectX 12 project.

I may even rewrite the vector and matrix DirectX 12 code.

I may be able to make a struct version for shader code; that may be great.



Igor Tandetnik on Mon, 09 Jan 2017 18:12:11


On 1/9/2017 11:35 AM, VerifyINT128 wrote:

If a, e and f calculations were performed repeatedly then they would be cached and be twice quicker.

But why, practically speaking, would you perform these calculations repeatedly?

If the performance bottleneck in your application involves the same calculations being performed repeatedly, then you are in luck: there is a quick and easy way to improve the performance - just stop performing the same calculations repeatedly. Rearrange your code to perform them only once.

I don't see the need to design a caching mechanism that helps in the unusual case of making the exact same computation multiple times, but hurts in the much more common case of performing multiple computations on distinct values. Again, all that checking and maintaining state has a cost.

You could always try it out.

Though when you go:

for (int i = 0; i < 10000; i++)
{
     res = a + b;
     cout << res << "\n";
}

It takes x milliseconds

when you just go:

for (int i = 0; i < 10000; i++)
{
     res = a + b;
}

it always takes 0 milliseconds. I'm not sure why my test does this?

In a release build, the compiler likely notices that the second loop does exactly nothing, and optimizes the whole thing away. In the debug build, the optimizer is probably not that aggressive - but on modern CPUs performing 10,000 additions is extremely fast, taking much less than 1ms. In the first loop, you are not timing the addition at all - you are measuring the time it takes to format and print the number. Compared to that, the addition is essentially instantaneous.
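
One way to time the arithmetic on its own, sketched under assumptions: keep all output outside the timed region, and read the operands through `volatile` so the optimizer cannot fold the loop into a constant. `Timing` and `time_additions` are hypothetical names; `std::chrono::steady_clock` is standard C++11.

```cpp
#include <chrono>
#include <cstdint>

// Result of one timed run (illustrative type, not from the thread).
struct Timing {
    std::int64_t sum;  // accumulated result, returned so the work is observable
    double seconds;    // wall-clock time spent inside the loop only
};

Timing time_additions(int iterations, int a, int b) {
    // volatile forces a real read of each operand on every iteration,
    // so the compiler cannot hoist or remove the addition.
    volatile int va = a;
    volatile int vb = b;
    std::int64_t sum = 0;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sum += va + vb;
    }
    auto stop = std::chrono::steady_clock::now();

    return { sum, std::chrono::duration<double>(stop - start).count() };
}
```

Printing `sum` and `seconds` after the function returns keeps the formatting and I/O cost out of the measurement. For a loop of 10,000 additions the elapsed time may still be too small to resolve, so a much larger iteration count is usually needed.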

Igor Tandetnik on Mon, 09 Jan 2017 18:18:51


On 1/9/2017 12:42 PM, VerifyINT128 wrote:

Here's MDType for double and float:

int main()
{
        MDType<int> caching;
        caching = 1;
        std::cout << "1+2=" << caching + 2 << std::endl;
        std::cout << "1-0=" << caching - 0 << std::endl;
}

Output:

1+2=3
1-0=0

See anything wrong with this picture?
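
The output above follows from the single `c` flag being shared by every operator: the addition clears it, so the subtraction sees `c == false` and `xs == 0` and returns the never-computed `rs`. A minimal sketch of one possible repair, assuming each operation keeps its own validity flag and assignment invalidates all of them. `CachedValue` and its member names are hypothetical, and only `+` and `-` are shown.

```cpp
// Per-operation validity flags instead of one shared "changed" flag
// (illustrative names, not from the thread).
template <class T>
class CachedValue {
    T x{};                        // stored value
    T addOperand{}, addResult{};  // one-entry cache for x + v
    T subOperand{}, subResult{};  // one-entry cache for x - v
    bool addValid = false;        // addResult is valid for (x, addOperand)
    bool subValid = false;        // subResult is valid for (x, subOperand)

public:
    CachedValue& operator=(T v) {
        if (v != x) {
            x = v;
            addValid = subValid = false;  // a new value invalidates every cache
        }
        return *this;
    }

    T operator+(T v) {
        if (!addValid || addOperand != v) {
            addOperand = v;
            addResult = x + v;
            addValid = true;
        }
        return addResult;
    }

    T operator-(T v) {
        if (!subValid || subOperand != v) {
            subOperand = v;
            subResult = x - v;
            subValid = true;
        }
        return subResult;
    }

    operator T() const { return x; }
};
```

With this layout, assigning 1 and then evaluating `+ 2` followed by `- 0` yields 3 and 1. The extra flags add to the per-object memory and per-call checks that the thread is already debating.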

PaulGGriffiths on Mon, 09 Jan 2017 18:22:09


Try making an optimized version of this:

for (int i = 0; i < 10000; i++)
{
    a = b + c;
    e = c + d;
    f = d + e;
    res += a + e + f;
    cout << res << "\n";
}

You could do this somewhere else in your code:

a = b + c;
e = c + d;
f = d + e;

and go:

for (int i = 0; i < 10000; i++)
{
    res += a + e + f;
    cout << res << "\n";
}

But a + e + f still has to be performed, so you may as well still use MTypes.

So you create a variable x for a + e + f. But then x may not change often, so you may as well use MType.

Writing very optimized code for every calculation both adds complexity to the code and takes effort.

My template is pretty good if you can afford the memory and know WHEN and WHEN NOT to use it.





Igor Tandetnik on Mon, 09 Jan 2017 18:34:19


On 1/9/2017 1:22 PM, VerifyINT128 wrote:

You could do this somewhere else in your code

 a = b + c;
        e = c + d;
        f = d + e;

and go:

for (int i = 0; i < 10000; i++)
{
        res += a + e + f;
        cout << res << "\n";
}

But a+b etc still has to be performed. So may still aswell use MType's.

I can also do

a = b + c;
e = c + d;
f = d + e;
int g = a + e + f;
for (int i = 0; i < 10000; i++)
{
    res += g;
    cout << res << "\n";
}

Same caching as what MType does, except with no overhead.

PaulGGriffiths on Mon, 09 Jan 2017 18:42:23


You're still performing:

a = b + c; e = c + d; f = d + e;

These could be MType, as could g.

If b or c change often, then don't use MType.

If a variable changes half the time, then you will gain nothing.

If it changes more than half the time, then you'll be at a loss.

If a variable changes all the time, then don't use MType.

Otherwise it's somewhere between no gain and 3 times as fast, so I think it is worth it.

Quite easy, to be honest.

This idea needs to be included in the compiler, as a lot of math is done behind the scenes.

On most computers you run out of processing power before you run out of memory.

Though MType uses 22 times more memory than a standard type.

I wouldn't use it in large arrays.





PaulGGriffiths on Mon, 09 Jan 2017 19:15:24


The fully optimised version is:

int a, b, c ,d , e, f;

a = b + c;
 e = c + d;
 f = d + e;

bool bChanged = false;

bool cChanged = false;

bool dChanged = false;

bool eChanged = false;

...

b = someNumber;

bChanged = true;

...

if (bChanged || cChanged)

a = b + c;

if (cChanged || dChanged)

e = c + d;

if (dChanged || eChanged)

f = d + e;

bChanged = false;

cChanged = false;

dChanged = false;

eChanged = false;


If you want to?

And that's just for a simple example; imagine something complex.

In fact I doubt you would bother. Have you ever seen complex code do it?

Put the effort into it and at best it is 3 times faster.

Think of something like a music synthesiser: it doesn't use all that much math, but it is run 44100 times a second.

MType would be good for a synth. Or write a lot of xChanged bools.












WayneAKing on Mon, 09 Jan 2017 19:29:45


Try this, the results are about 3 times as fast:

int's took  12327 clicks (12.327000 seconds).
MType<int>'s took  4468 clicks (4.468000 seconds).

Using which compiler?

Using which build properties/settings?

Using what speed computer with what hardware configurations?

Assuming that you're using VC++ try some comparative builds and tests.

(1) Debug build with default options.

(2) Release build with default options (Optimize: Maximize Speed)

(3) Release build with Optimizations Disabled.

etc.

Then repeat all the same tests after changing the order of the calculations
in your program. Instead of doing the int calcs first followed by the MType
calcs, do the MType calcs first and then the int calcs. Compare the results.

My results:

Release build (Optimizations disabled /Od):

int's took  7800 clicks (7.800000 seconds).
MType<int>'s took  7440 clicks (7.440000 seconds).

Same but with MType calcs first:

int's took  7580 clicks (7.580000 seconds).
MType<int>'s took  7960 clicks (7.960000 seconds).

- Wayne

PaulGGriffiths on Mon, 09 Jan 2017 19:36:09


What calc did you perform? My compiler is Visual Studio 2017 on Windows 10, with the standard compiler settings as set when creating a new console app.

intel 3510 3.0G.

Release build (Optimizations disabled /Od):

int's took  11682 clicks (11.682000 seconds).
MType<int>'s took  4453 clicks (4.453000 seconds).

Doing the int and MType the other way around and I get:

int's took  4412 clicks (4.412000 seconds).
MType<int>'s took  11800 clicks (11.800000 seconds).

Why is that?




PaulGGriffiths on Mon, 09 Jan 2017 19:53:34


Int's first is:

int's took  11828 clicks (11.828000 seconds).
MType<int>'s took  4291 clicks (4.291000 seconds).

MType first is:

int's took  4311 clicks (4.311000 seconds).
MType<int>'s took  12014 clicks (12.014000 seconds).

Strange thing going on here.

So:

int's 11828 + 4311 = 16139.

MType 12014 + 4291 = 16305.

About the same. Could be wasting my time here then?

So why is it different depending which way you do it?

What's holding up the second test?



WayneAKing on Mon, 09 Jan 2017 19:54:45


What calc did you perform? My compiler is Visual Studio 2017 on Windows 10, with the standard compiler settings as set when creating a new console app.

intel 3510 3.0G.

I used the code exactly as you posted it. For the second test I simply
moved the code block with the MType test up to before the code block
with the int tests. The timings I posted were executing from the IDE.

Running the same tests from the command line:

Same from the command line (MType calcs first):

int's took  7400 clicks (7.400000 seconds).
MType<int>'s took  7400 clicks (7.400000 seconds).

C:\...ojects\Memoization tests\Release>

From the command line with int calcs first:

int's took  7540 clicks (7.540000 seconds).
MType<int>'s took  7440 clicks (7.440000 seconds).

C:\...ojects\Memoization tests\Release>

Using VS 2012 Express, Win7 64-bit, on an AMD 8-core processor with
3.1 GHz and 16GB memory. Program built as 32-bit.

- Wayne

Wyck on Mon, 09 Jan 2017 20:01:05


If you comment out the couts:

int's took  6 clicks (0.006000 seconds).
VerifyType<int>'s took  132 clicks (0.132000 seconds).

Igor Tandetnik on Mon, 09 Jan 2017 20:08:49


On 1/9/2017 2:15 PM, VerifyINT128 wrote:

The fully optimised version is:
int a, b, c ,d , e, f;

a = b + c;
 e = c + d;
 f = d + e;

bool bChanged = false;

bool cChanged = false;

bool dChanged = false;

bool eChanged = false;


...

b = someNumber;

bChanged = true;

Presumably, c and d are also given some initial values, so cChanged and dChanged should also be initialized to true.

if (bChanged || cChanged)

This condition is true

    a = b + c;

So this addition is executed. You also forgot to set aChanged and clear bChanged

if (cChanged || dChanged)    e = c + d;

The condition is true, so the addition is executed. You should also set eChanged and clear cChanged

if (dChanged || eChanged)    f = d + e;

This condition is true, so the addition is executed.

In the end, all that change tracking didn't save you a single operation, but added a bunch of overhead. Which is exactly what happens when your MType is used. It actually does more work - not only does it track "changed" flag, but also the most recently used right-hand operand.

Put the effort into it and at best it is 3 times faster.

How can it possibly be faster, when it does strictly more work? Where do you expect the savings to come from?
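
A minimal sketch of the change tracking with the flags set and cleared consistently, assuming b, c and d are the only inputs and a, e and f are derived from them. The `Sums` struct, its setters and the `additions` counter are hypothetical names used for illustration.

```cpp
// Dirty-flag recomputation for a = b + c, e = c + d, f = d + e
// (illustrative names, not from the thread).
struct Sums {
    int b = 0, c = 0, d = 0;  // inputs
    int a = 0, e = 0, f = 0;  // derived values
    bool bChanged = true;     // inputs start dirty so the first update computes everything
    bool cChanged = true;
    bool dChanged = true;
    int additions = 0;        // counts real additions, for illustration only

    void setB(int v) { if (v != b) { b = v; bChanged = true; } }
    void setC(int v) { if (v != c) { c = v; cChanged = true; } }
    void setD(int v) { if (v != d) { d = v; dChanged = true; } }

    void update() {
        bool eChanged = false;  // e is derived, so its flag is set here, not by a setter
        if (bChanged || cChanged) { a = b + c; ++additions; }
        if (cChanged || dChanged) { e = c + d; ++additions; eChanged = true; }
        if (dChanged || eChanged) { f = d + e; ++additions; }
        bChanged = cChanged = dChanged = false;  // everything is up to date again
    }
};
```

The first `update()` performs all three additions, so nothing is saved there. Later calls skip work only when an input is untouched, and every call still pays for the flag tests, which is the cost being questioned above.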

PaulGGriffiths on Mon, 09 Jan 2017 20:10:12


Performed the test twice this is the results:

int's first

test1 int's took  12214 clicks (12.214000 seconds).
test1 MType<int>'s took  4432 clicks (4.432000 seconds).
test2 int's took  4206 clicks (4.206000 seconds).
test2 MType<int>'s took  4233 clicks (4.233000 seconds).

MType's first

test1 int's took  4363 clicks (4.363000 seconds).
test1 MType<int>'s took  11789 clicks (11.789000 seconds).
test2 int's took  4582 clicks (4.582000 seconds).
test2 MType<int>'s took  4746 clicks (4.746000 seconds).

This is the code:

int main()
{
	int i;
	int res;
	clock_t t1;
	clock_t t2;
	clock_t t3;
	clock_t t4;
	MDType<int> a1;
	MDType<int> b1;
	MDType<int> c1;
	MDType<int> d1;
	MDType<int> e1;
	MDType<int> f1;

	int a2;
	int b2;
	int c2;
	int d2;
	int e2;
	int f2;


	a1 = 1.234;
	b1 = 2.345;
	c1 = 3.456;
	d1 = 4.567;
	e1 = 5.678;
	f1 = 6.789;

	a2 = 1.234;
	b2 = 2.345;
	c2 = 3.456;
	d2 = 4.567;
	e2 = 5.678;
	f2 = 6.789;

	res = 0;
	i = 0;
	t2 = clock();
	for (i = 0; i < 10000; i++)
	{
		a1 = b1 * c1;
		e1 = c1 * d1;
		f1 = d1 * e1;
		res += a1 * e1 * f1;
		cout << res << "\n";
	}
	t2 = clock() - t2;

	res = 0;
	i = 0;
	t1 = clock();
	for (i = 0; i < 10000; i++)
	{
		a2 = b2 * c2;
		e2 = c2 * d2;
		f2 = d2 * e2;
		res += a2 * e2 * f2;
		cout << res << "\n";
	}
	t1 = clock() - t1;

	res = 0;
	i = 0;
	t4 = clock();
	for (i = 0; i < 10000; i++)
	{
		a1 = b1 * c1;
		e1 = c1 * d1;
		f1 = d1 * e1;
		res += a1 * e1 * f1;
		cout << res << "\n";
	}
	t4 = clock() - t4;

	res = 0;
	i = 0;
	t3 = clock();
	for (i = 0; i < 10000; i++)
	{
		a2 = b2 * c2;
		e2 = c2 * d2;
		f2 = d2 * e2;
		res += a2 * e2 * f2;
		cout << res << "\n";
	}
	t3 = clock() - t3;


	printf("test1 int's took  %ld clicks (%f seconds).\n", (long)t1, ((float)t1) / CLOCKS_PER_SEC);
	printf("test1 MType<int>'s took  %ld clicks (%f seconds).\n", (long)t2, ((float)t2) / CLOCKS_PER_SEC);

	printf("test2 int's took  %ld clicks (%f seconds).\n", (long)t3, ((float)t3) / CLOCKS_PER_SEC);
	printf("test2 MType<int>'s took  %ld clicks (%f seconds).\n", (long)t4, ((float)t4) / CLOCKS_PER_SEC);

	char input;
	cin >> input;

	return 0;
}

WayneAKing on Mon, 09 Jan 2017 20:15:00


If you comment out the couts:

int's took  6 clicks (0.006000 seconds).
VerifyType<int>'s took  132 clicks (0.132000 seconds).

What are you using, an XT? On mine if I comment out the cout lines
the result is:

int's took  0 clicks (0.000000 seconds).
MType<int>'s took  0 clicks (0.000000 seconds).

- Wayne

PaulGGriffiths on Mon, 09 Jan 2017 20:19:55


If you perform the test without the cout's it changes each time.

That's why I included them.

Igor Tandetnik on Mon, 09 Jan 2017 20:22:17


On 1/9/2017 3:10 PM, VerifyINT128 wrote:

Performed the test twice this is the results:

int's first

test1 int's took  12214 clicks (12.214000 seconds).
test1 MType<int>'s took  4432 clicks (4.432000 seconds).
test2 int's took  4206 clicks (4.206000 seconds).
test2 MType<int>'s took  4233 clicks (4.233000 seconds).



MType's first

test1 int's took  4363 clicks (4.363000 seconds).
test1 MType<int>'s took  11789 clicks (11.789000 seconds).
test2 int's took  4582 clicks (4.582000 seconds).
test2 MType<int>'s took  4746 clicks (4.746000 seconds).

Now add a loop that doesn't perform any computation at all, but just prints the same number 10,000 times. Like this:

    int res = 100;
    for (i = 0; i < 10000; i++)
    {
        cout << res << "\n";
    }

I predict it would take about the same 4 seconds (after warm-up).

Again, you are not measuring the time it takes to perform arithmetic - you are measuring the time it takes to produce output, which dwarfs any arithmetic by orders of magnitude.

As to why the very first test takes longer than others: the first time std::cout is used, all kinds of OS DLLs need to be loaded and initialized to perform I/O. That takes time, once. That one-time hit is sometimes referred to as warm-up or warming-up (by an analogy with old lamp-based devices that needed some time to literally warm up to become fully operational when turned on).
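
A minimal sketch of how the warm-up effect could be kept out of a comparison, assuming the workload is run several times and the first run is discarded. `time_runs` and `best_after_warmup` are hypothetical helper names; the timing uses the standard `std::chrono::steady_clock`.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Runs the workload `runs` times and records the duration of each run in seconds.
template <class Work>
std::vector<double> time_runs(Work work, int runs) {
    std::vector<double> seconds;
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}

// Ignores the first (warm-up) run and returns the fastest of the remaining runs.
// Returns -1.0 when there are not enough runs to discard a warm-up.
double best_after_warmup(const std::vector<double>& seconds) {
    if (seconds.size() < 2) {
        return -1.0;
    }
    return *std::min_element(seconds.begin() + 1, seconds.end());
}
```

Passing each test loop as a lambda, for example `time_runs([&] { /* int loop */ }, 5)`, and comparing the `best_after_warmup` values makes the order of the tests matter much less. It does not change the fact that a loop containing `cout` mostly measures output.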

PaulGGriffiths on Mon, 09 Jan 2017 20:37:11


I performed this test; it seems I'm completely wasting my time. Sorry, people.

It seems C++ classes are really slow. I think I'll go back a step and program in just C, with no classes or structs.

test1 int's took  56 clicks (0.056000 seconds).
test1 MType<int>'s took  713 clicks (0.713000 seconds).
test2 int's took  56 clicks (0.056000 seconds).
test2 MType<int>'s took  631 clicks (0.631000 seconds).

int main()
{
	int i, j;
	int res;
	clock_t t1;
	clock_t t2;
	clock_t t3;
	clock_t t4;

	int a1;
	int b1;
	int c1;
	int d1;
	int e1;
	int f1;

	MType<int> a2;
	MType<int> b2;
	MType<int> c2;
	MType<int> d2;
	MType<int> e2;
	MType<int> f2;


	a1 = 1;
	b1 = 2;
	c1 = 3;
	d1 = 4;
	e1 = 5;
	f1 = 6;

	a2 = 1;
	b2 = 2;
	c2 = 3;
	d2 = 4;
	e2 = 5;
	f2 = 6;

	res = 0;
	i = 0;
	j = 0;
	t1 = clock();
	for (j = 0; j < 1000; j++)
	{
		for (i = 0; i < 10000; i++)
		{
			a1 = b1 * c1;
			e1 = c1 * d1;
			f1 = d1 * e1;
			res += a1 * e1 * f1;
		}
	}
	cout << res << "\n";
	t1 = clock() - t1;

	res = 0;
	i = 0;
	j = 0;
	t2 = clock();
	for (j = 0; j < 1000; j++)
	{
		for (i = 0; i < 10000; i++)
		{
			a2 = b2 * c2;
			e2 = c2 * d2;
			f2 = d2 * e2;
			res += a2 * e2 * f2;
		}
	}
	cout << res << "\n";
	t2 = clock() - t2;

	res = 0;
	i = 0;
	j = 0;
	t3 = clock();
	for (j = 0; j < 1000; j++)
	{
		for (i = 0; i < 10000; i++)
		{
			a1 = b1 * c1;
			e1 = c1 * d1;
			f1 = d1 * e1;
			res += a1 * e1 * f1;
		}
	}
	cout << res << "\n";
	t3 = clock() - t3;

	res = 0;
	i = 0;
	j = 0;
	t4 = clock();
	for (j = 0; j < 1000; j++)
	{
		for (i = 0; i < 10000; i++)
		{
			a2 = b2 * c2;
			e2 = c2 * d2;
			f2 = d2 * e2;
			res += a2 * e2 * f2;
		}
	}
	cout << res << "\n";
	t4 = clock() - t4;

	printf("test1 int's took  %ld clicks (%f seconds).\n", (long)t1, ((float)t1) / CLOCKS_PER_SEC);
	printf("test1 MType<int>'s took  %ld clicks (%f seconds).\n", (long)t2, ((float)t2) / CLOCKS_PER_SEC);

	printf("test2 int's took  %ld clicks (%f seconds).\n", (long)t3, ((float)t3) / CLOCKS_PER_SEC);
	printf("test2 MType<int>'s took  %ld clicks (%f seconds).\n", (long)t4, ((float)t4) / CLOCKS_PER_SEC);

	char input;
	cin >> input;

	return 0;
}





Wyck on Mon, 09 Jan 2017 20:39:53


Just build it in release mode and output a listing file, then you'll see what code really got generated.  I suspect Igor's right that it completely ignores the native multiplication and times the cout alone.

I also think the time is massively skewed by the initialization cost of the first cout.

Try reversing the order of the tests.  Test your VerifyType block first and the native multiplies second.