Tags

, , ,

I started C# programming with a image manipulation program, it was a Windows Phone application and it was a very huge project compared to apps we have now for smart phone. Unfortunately we are still hiding it from the public because people never respect the value of that software when we demo-ed it to some so called Techies in 2013. Six months later, I had “Digital Image Processing” course work my degree program which taught me the real behind the scenes of what I wrote as a code earlier.

That code had lots of performance issues, it took a considerable time to analyze an 8MP image. I didn’t touch the code after we demo-ed and got rejected (but the reject was really not for its performance). Years later, I am working on another performance issue on another software and one of my conclusion was to change some data-structures, actually they are kind of collections to some good performing ones.

I reopened the old image manipulation program I wrote in 2013 and notices I ‘ve used Arrays and Lists primarily as Collection data structures, I was thinking of how can I improve them and what could be the best fit data-structure for this kind of work.

How image recognition works

You know C# is awesome, it provide several ways to manipulate an image a very well known feature is ColorMatrix class with it we can very simply adjust RGB, Hue, Saturation, Brightness of an image.

Normally image recognition algorithms, or some kind of image manipulations like sharping, blurring require a gray-scale image  and it to be threshold-ed.

mona_threshed

Then a common mistake we does is store bits of the threshold-ed image to an Array that accepts bytes. You know byte can accept values form 0 to 255 but the actual values we are going to put are just 0 and 1, for threshold-ed image because it actually contains both pixels that are fully black and fully white after it we manipulate it with bitwise operations like XOR, OR and NOT. Here comes the real need if BitArray.

BitArray in C#

It is there since .NET 1.1 and it has several constructors, so it can accept a Boolean array or a Byte array or an Integer array or even another BitArray. But the beauty of this is this has built in functions for operations like NOT, AND, OR and XOR which are mostly optimized and performs better.

For a quick example,

BitArray ba1 = new BitArray(8);
byte[] a = { 255 };

ba1 = new BitArray(a);

for (int i = 0; i < ba1.Count; i++)
{
Console.Write("{0, -6} ", ba1[i]);
}

Out put of this comes as
True True True True True True True True
like wise, we can directly apply bitwise operations on this code, like

BitArray ba1 = new BitArray(8);
byte[] a = { 255 };

ba1 = new BitArray(a);
ba1 = ba1.Not();

for (int i = 0; i < ba1.Count; i++)
{
Console.Write("{0, -6} ", ba1[i]);
}

for which everything I will get False.
Another interesting operation could be

BitArray ba1 = new BitArray(8);
BitArray ba2 = new BitArray(8);
BitArray ba3 = new BitArray(8);
byte[] a = { 155 };
byte[] b = { 78 };

ba1 = new BitArray(a);
ba2 = new BitArray(b);
ba3 = ba1.Xor(ba2);

for (int i = 0; i < ba3.Count; i++)
{
Console.Write("{0, -6} ", ba3[i]);
}

Performance, Efficiency and Storage

Storage does not really matter unless you are doing something for IoT with limited storage. Fortunately BitArray takes 1/8 of the space which is taken by bool,  because they “pack” 8 boolean values into a single byte, whereas a bool by itself will take up the whole 8-bit byte.

In performance wise, sometimes List<bool> are faster than BitArray for a given size, unless you are doing more and more bit wise operations we does in image processing. Take a look at the below code, it does not do any bit wise operation and you can notice BitArrays perform poor in this case,

class Program
{
	static void Main(string[] args)
	{
		Random r = new Random();
		r.Next(1000);

		const int N = 10000000;

		Console.WriteLine("Testing with {0} operations:", N);

		Console.WriteLine("   A BitArray (32) took {0} ms.", TestBitArray(r, N));
		Console.WriteLine("   A List<bool>(32) took {0} ms.", TestBoolArray(r, N));

		Console.Read();
	}

	static long TestBitArray(Random r, int n)
	{
		BitArray b = new BitArray(32, false);     // 40 bytes

		var sw = Stopwatch.StartNew();
		for (int i = 0; i < n; i++)
		{

			b.Set(r.Next(32), true);
			bool v = b.Get(r.Next(32));
			b.Set(r.Next(32), v);
		}
		sw.Stop();
		return sw.ElapsedMilliseconds;
	}

	static long TestBoolArray(Random r, int n)
	{
		bool[] ba = new bool[32];

		var sw = Stopwatch.StartNew();
		for (int i = 0; i < n; i++)
		{

			ba[r.Next(32)] = true;
			bool v = ba[r.Next(32)];
			ba[r.Next(32)] = v;
		}
		sw.Stop();
		return sw.ElapsedMilliseconds;
	}
}

But I wrote the below two sets of code

class Program
    {
        static void Main(string[] args)
        {
            Random r = new Random();
            r.Next(1000);

            const int N = 900000;

            Console.WriteLine("Testing with {0} operations:", N);

            Console.WriteLine("   A BitArray (32) took {0} ms.", TestBitArray(r, N));
            Console.WriteLine("   A List<bool>(32) took {0} ms.", TestBoolArray(r, N));

            Console.Read();
        }

        static long TestBitArray(Random r, int n)
        {
            BitArray b = new BitArray(32, false);     // 40 bytes
            BitArray c = new BitArray(32, false);
            BitArray d = new BitArray(32, false);
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < n; i++)
            {

                b.Set(r.Next(32), true);
                bool v = b.Get(r.Next(32));
                b.Set(r.Next(32), v);
            }

            for (int i = 0; i < n; i++)
            {

                c.Set(r.Next(32), true);
                bool v = c.Get(r.Next(32));
                c.Set(r.Next(32), v);
            }

            d = b.Xor(c);
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }

        static long TestBoolArray(Random r, int n)
        {
            bool[] ba = new bool[32];
            bool[] ca = new bool[32];
            bool[] da = new bool[32];

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < n; i++)
            {

                ba[r.Next(32)] = true;
                bool v = ba[r.Next(32)];
                ba[r.Next(32)] = v;
            }
            for (int i = 0; i < n; i++)
            {

                ca[r.Next(32)] = true;
                bool v = ca[r.Next(32)];
                ca[r.Next(32)] = v;
            }
            for (int i = 0; i < n; i++)
            {

                da[r.Next(32)] = ba[r.Next(32)] ^ ca[r.Next(32)];
            }
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }
    }

class Program
    {
        static void Main(string[] args)
        {
            Random r = new Random();
            r.Next(1000);

            const int N = 9000000;

            Console.WriteLine("Testing with {0} operations:", N);

            Console.WriteLine("   A BitArray (32) took {0} ms.", TestBitArray(r, N));
            Console.WriteLine("   A List<bool>(32) took {0} ms.", TestBoolArray(r, N));

            Console.Read();
        }

        static long TestBitArray(Random r, int n)
        {
            BitArray b = new BitArray(32, false);     // 40 bytes
            BitArray c = new BitArray(32, false);
            BitArray d = new BitArray(32, false);
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < n; i++)
            {

                b.Set(r.Next(32), true);
                bool v = b.Get(r.Next(32));
                b.Set(r.Next(32), v);
            }

            for (int i = 0; i < n; i++)
            {

                c.Set(r.Next(32), true);
                bool v = c.Get(r.Next(32));
                c.Set(r.Next(32), v);
            }

            d = b.Xor(c);
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }

        static long TestBoolArray(Random r, int n)
        {
            bool[] ba = new bool[32];
            bool[] ca = new bool[32];
            bool[] da = new bool[32];

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < n; i++)
            {

                ba[r.Next(32)] = true;
                bool v = ba[r.Next(32)];
                ba[r.Next(32)] = v;
            }
            for (int i = 0; i < n; i++)
            {

                ca[r.Next(32)] = true;
                bool v = ca[r.Next(32)];
                ca[r.Next(32)] = v;
            }
            for (int i = 0; i < n; i++)
            {

                da[r.Next(32)] = ba[r.Next(32)] ^ ca[r.Next(32)];
            }
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }
    }

notice in the first one, I have const int N = 900000;(5 zeros) and in next I have const int N = 9000000;(6 zeros) and I am getting some un-consistent performance on both. Just have a look below,
Untitled1
Untitled1
But you can see the efficiency, number if lines we have to write. Still looks better right.

Things we can try

I tried replacing some amount of arrays I used in Laplacian of Gaussion edge detection algorithm and I really did not see any very significant improvement but I have reduced the lines and complexity of my code really, it became more readable. But I could suggest re-writing SHA algorithms or other cipher algorithms that are heavily using bit operations with this BitArray.

Advertisements