Comments on The Delphi Geek: Fast binary searching, part 2: Hard data

Atle Smelvaer (2012-04-06 22:42):
Size is of course a major player when the amount of data is very large. But you could gain some speed by using vectored "waypoints" computed while you sort your data.

As for the other point: unless your data must be sorted, you also gain a lot in this area by using hashing, since you don't need to sort the data at all.

gabr42 (2012-04-06 08:11):
Hashing is good, but it cannot solve every problem. In my original problem I barely had enough memory to store a very packed representation of the data; there was no way to also fit a hash of it in memory.

Sometimes you also don't know what is in your data, but you are still searching in it.
If this is the case, a linear search of any variety is much faster than building a hash of everything in your data.

Atle Smelvaer (2012-04-06 00:22):
First of all, good hashing will most probably always give you better speed for this.

To take you into some other areas of binary search: if your data doesn't change often, you could precompute vectored "waypoints" for it, giving a better average for each section and subsection and, in that way, more precise jumps. Once a subsection contains fewer than 64 elements, you can switch to an ordinary binary search. Also, compute the average without "special case" items: say you have 1, 3, 7, 9, 100. In this sample, 100 is a "special case" and should be dropped.
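One possible reading of the waypoint idea can be sketched as follows. This is a hypothetical Python illustration, not code from the post; the names `build_waypoints` and `waypoint_search` are mine. Positions are sampled every 64 elements while the data is sorted, the (much smaller) waypoint list narrows the range in one jump, and an ordinary binary search finishes off the small subsection:

```python
import bisect

def build_waypoints(sorted_data, step=64):
    """Sample every step-th (index, key) pair; step=64 mirrors the
    comment's threshold for switching to plain binary search."""
    return [(i, sorted_data[i]) for i in range(0, len(sorted_data), step)]

def waypoint_search(sorted_data, waypoints, key):
    """Jump close to the key via the waypoints, then finish the
    remaining small subsection with an ordinary binary search."""
    lo, hi = 0, len(sorted_data)
    keys = [k for _, k in waypoints]
    w = bisect.bisect_right(keys, key)
    if w > 0:
        lo = waypoints[w - 1][0]   # last waypoint whose key is <= key
    if w < len(waypoints):
        hi = waypoints[w][0]       # first waypoint whose key is > key
    i = bisect.bisect_left(sorted_data, key, lo, hi)
    return i if i < len(sorted_data) and sorted_data[i] == key else -1
```

The waypoint list costs one extra (short) binary search but keeps the main search inside a cache-friendly 64-element window; a real implementation would also apply the "drop the special cases" averaging trick when choosing waypoint positions.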
gabr42 (2007-01-27 18:22):
I agree fully.

Maybe I'll publish run times sometime, but that is harder to measure: I must keep one computer free of all other tasks for an hour (times the number of different scenarios I want to measure).

Anonymous (2007-01-27 17:46):
Fewer search steps stop being a good thing when they take longer than more search steps. Run times here would be more valuable than the number of steps.
gabr42 (2007-01-27 07:38):
Yes, that is also a quite effective approach. It wasn't usable in my case, though.

Anonymous (2007-01-26 23:46):
I enjoyed reading your blog post on improving the binary search, something which has always been dear to my heart.

I have found that the best way to improve a binary search is to find some way to limit what you are searching, and to plan in advance.

For example, if I have to perform searches against a national database of addresses, it makes more sense to separate my files by SCF (the first three digits of a ZIP code) and then perform binary searches within each one. To search by state, I just create a lookup table for each state which lists the SCFs in that state.

Other performance gains can be made by keeping the search keys in a separate file, with only a record pointer to where the actual data is located. A simple index built this way can then be read in "chunks": each chunk holds its high and low keys first, followed by the rest of the chunk in order. The end result is less disk activity once the search finds the correct page.
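The chunked-index idea above can be sketched in Python (a minimal illustration under my own assumptions; in the comment the chunks would be disk pages, and the names `build_pages` and `paged_search` are hypothetical). Each page carries its lowest and highest key up front, so a lookup only needs those headers until it lands on the one page worth reading in full:

```python
import bisect

def build_pages(sorted_keys, page_size=4):
    """Split sorted keys into pages of (lo, hi, keys): lo/hi act as the
    'high and low keys first' header, readable without the page body."""
    pages = []
    for i in range(0, len(sorted_keys), page_size):
        chunk = sorted_keys[i:i + page_size]
        pages.append((chunk[0], chunk[-1], chunk))
    return pages

def paged_search(pages, key):
    """Locate the page whose [lo, hi] range could contain key using
    header data only, then binary-search inside that single page."""
    los = [lo for lo, _, _ in pages]
    p = bisect.bisect_right(los, key) - 1   # last page with lo <= key
    if p < 0:
        return False                        # key below the first page
    lo, hi, chunk = pages[p]
    if key > hi:
        return False                        # falls in a gap between pages
    j = bisect.bisect_left(chunk, key)      # one in-page binary search
    return j < len(chunk) and chunk[j] == key
```

The payoff matches the comment's point: only the headers and a single page are touched per lookup, which on disk translates into one page read instead of O(log n) scattered seeks through the whole file.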