Vinodchandran Variyam, a professor in the School of Computing, and a group of colleagues turned an idea sparked during a 2019 dinner conversation into an algorithm that could change large-scale data analysis by addressing a problem that has challenged computer scientists for years.
The Distinct Elements Problem is the algorithmic task of approximating the number of unique elements in a dataset when storing or processing the entire dataset is impractical. Its applications span many fields, including network traffic analysis, database management, marketing, bioinformatics and text analysis. It could also support tasks such as fraud detection, in which an efficient algorithm could quickly flag unusual but not immediately recognizable patterns that deviate from expected or historical ones.
The new algorithm developed by Variyam and his colleagues uses a sampling strategy that reduces memory requirements. That makes it highly scalable, an essential property in computing environments where large amounts of data must often be processed very quickly.
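To make the idea of a memory-saving sampling strategy concrete, here is a minimal Python sketch of one sampling-based distinct-count estimator. The article does not spell out the algorithm's steps, so this is an illustration under stated assumptions rather than the researchers' exact method: the function name `estimate_distinct`, the `max_samples` and `rng` parameters, and the choice to keep halving the sample until it fits are all hypothetical.

```python
import random


def estimate_distinct(stream, max_samples=1000, rng=None):
    """Estimate the number of distinct items in `stream`.

    Illustrative sketch only: keeps at most about `max_samples` items in
    memory, no matter how long the stream is.
    """
    if max_samples < 1:
        raise ValueError("max_samples must be at least 1")
    rng = rng or random.Random()

    p = 1.0          # current probability that a seen item stays in the sample
    sample = set()   # the small in-memory sample of items

    for item in stream:
        # Forget any earlier decision about this item, then decide afresh,
        # so only its most recent occurrence matters.
        sample.discard(item)
        if rng.random() < p:
            sample.add(item)

        # When the sample is full, drop each kept item with probability 1/2
        # and halve the keep probability for everything that follows.
        while len(sample) >= max_samples:
            sample = {x for x in sample if rng.random() < 0.5}
            p /= 2.0

    # Each distinct item survives with probability p, so scale back up.
    return len(sample) / p
```

When the stream has fewer distinct items than `max_samples`, nothing is ever dropped and the count is exact; for example, `estimate_distinct([1, 2, 2, 3, 3, 3], max_samples=10)` returns `3.0`. For larger streams the result is an approximation, and a larger `max_samples` trades more memory for a tighter estimate.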
 
 
 
 
