The "Naive" approach for this problem is to just run 2 loops where you take each of the elements of a first array and check it against element of a second array.
bool NaiveIntersect(List<int> setOne, List<int> setTwo)
{
for(int i1=0;i1<setOne.Count;i1++)
{
for(int i2=0;i2<setTwo.Count;i2++)
{
if (setTwo[i2]==setOne[i1])
{
return true;
}
}
}
return false;
}
Obviously this is a not optimal solution with complexity of O(n*m) .NET framework provide number of build in classes what give ability to test overlap of two arrays.
The list I came up with:
- LINQ intersect extension method
- Hashset collection
- BitArray collection
bool LinqOverlap(List<int> setOne, List<int> setTwo)HashSet it's a hashing data structure similar to HashTable or Dictionary collection, the main difference is what HashSet don't have key and values it has only values ,and hash is performed on them. In hash set you to find an element you just need to compute the hash of this element and get it from appropriate it's much faster that to iterate over a collection. HashSet collection provides a convenient Overlaps method to test this exact scenario so code is very short and obvious.
{
return setOne.Intersect(setTwo).Any();
}
bool HashSetIntersect(HashSet<int> set1, HashSet<int> set2)Another interesting collection is BitArray , underling data structure of BitArray is simple array of integers. The main strange of BitArray is in ability to perform bit operations across two arrays of number, so in our case overlap method will look like this. We perform bitwise AND operation and check if one of the bits in the intersect is set to true.
{
return set1.Overlaps(set2);
}
bool BitArrayOverlap(BitArray set1, BitArray set2)
{
var intersect=set1.And(set2);
for(int j=0;j<intersect.Length;j++)
{
if(intersect[j])
return true;
}
return false;
}
I took the code above and run it on 2 arrays of integers containing 125 elements each for 100,000 times. So lets se who is the winner
NaiveDoubleLoop - 7995
Linq- 1915
HashSet Overlap - 688
BitArray - 38
As you can see from the table the obvious winner is a BitArray. And it's true for a general case , but in some special case we can get much better performance , lets talk about one more method , it's an optimization of a BitArray method. It suited for cases where sets of integers we going to test for overlap can contain only values we will know in advance and the list of this values is small (couple of hundreds). If we have only 64 possible values we can just assign a bit mask for each of the possible integers values in our sets so a set will be represented by a single long value and testing for overlap is just a binary AND between 2 longs. If we have more that 64 possible values it's possible to extend this method by giving each value in addition to bit mask another parameter that will tell us for what group it belongs so our sets will be a short array of long and we will need to perform AND operation on each pair of values in this arrays.
Here is the code that shows the implementation
bool BinaryOverlap(long[] setOne, long[] setTwo)So what about performance you say ? For 192 values it's performs 100,000 overlap tests in 1 ms. This approach will not fit every scenario , we can't always control the integers that come to us but if we can it will give us blazingly fast performance.
{
return
((setOne[0] & setTwo[0])
(setOne[1] & setTwo[1])
(setOne[2] & setTwo[2])) != 0 ;
}