3. Dr. Hussien M.
Sharaf
3
Structure of Indexes
An index must be sorted in ascending or descending
order on one or more fields.
Index:
CompanyName, offset
Google, 211
IBM, 0
ITE, 643
Microsoft, 462
Apple Mac, 985
Data file:
Record1\n
Record2\n
Record3\n
Record4\n
New record\n
Given the following file, build an index for
company names.
File contents:
ID, Name, Phone, comment\n
5, Google, 2133 7710, 5 branches\n
7, ITE, 24413 9900, 1 branch\n
8, Microsoft, 2789 0054, 9 branches\n
9, IBM, 24413 9900, 5 branches\n
10, Apple Mac, 2567 9876, 8 branches\n
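Step 1: Write down the offset of each record.
Offsets are byte positions from the start of the file, counting
each line terminator as two bytes (\r\n on disk). Each offset is the
previous offset plus the previous line's length: the 24-character
header puts Google at 26, and 26 + 34 = 60 gives ITE.
Company Name, offset
Google, 26
ITE, 60
Microsoft, 90
IBM, 127
Apple Mac, 159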
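The offsets can also be computed by a short script instead of counting characters by hand. The following is a minimal sketch in Python, assuming the file is ASCII and every line ends with the two-byte terminator \r\n (which is what a first-record offset of 26 implies); `FILE_BYTES` and `build_offsets` are illustrative names, not part of the course material.

```python
import io

# The sample file from the exercise, with "\r\n" line endings (an assumption).
FILE_BYTES = (
    b"ID, Name, Phone, comment\r\n"
    b"5, Google, 2133 7710, 5 branches\r\n"
    b"7, ITE, 24413 9900, 1 branch\r\n"
    b"8, Microsoft, 2789 0054, 9 branches\r\n"
    b"9, IBM, 24413 9900, 5 branches\r\n"
    b"10, Apple Mac, 2567 9876, 8 branches\r\n"
)


def build_offsets(stream):
    """Return (company_name, offset) pairs in file order."""
    entries = []
    stream.readline()  # skip the header line
    while True:
        offset = stream.tell()  # byte position where the next record starts
        line = stream.readline()
        if not line:  # end of file
            break
        fields = line.decode("ascii").rstrip("\r\n").split(", ")
        entries.append((fields[1], offset))  # fields[1] is the Name column
    return entries


offsets = build_offsets(io.BytesIO(FILE_BYTES))
for name, offset in offsets:
    print(f"{name}, {offset}")
```

Calling `tell()` before each `readline()` records exactly where a record begins, so the result follows whatever line terminator the file actually uses.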
Step 2: Reorder the index entries by company name.
Company Name, offset
Apple Mac, 159
Google, 26
IBM, 127
ITE, 60
Microsoft, 90
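Step 2 only reorders the (name, offset) pairs; the data file itself is left untouched. A minimal Python sketch of this, again assuming ASCII text with \r\n line endings, sorts the pairs by name and then uses an offset to jump straight to a record. `DATA`, `build_sorted_index` and `read_record` are hypothetical names chosen for illustration.

```python
import io

# Sample file with "\r\n" line endings (an assumption about the encoding).
DATA = (
    b"ID, Name, Phone, comment\r\n"
    b"5, Google, 2133 7710, 5 branches\r\n"
    b"7, ITE, 24413 9900, 1 branch\r\n"
    b"8, Microsoft, 2789 0054, 9 branches\r\n"
    b"9, IBM, 24413 9900, 5 branches\r\n"
    b"10, Apple Mac, 2567 9876, 8 branches\r\n"
)


def build_sorted_index(data):
    """Pair each company name with its record offset, then sort by name."""
    index = []
    offset = 0
    for line_no, line in enumerate(data.splitlines(keepends=True)):
        if line_no > 0:  # line 0 is the header
            name = line.decode("ascii").split(", ")[1]
            index.append((name, offset))
        offset += len(line)  # len() includes the line terminator
    return sorted(index)  # tuples compare by name first


def read_record(data, offset):
    """Seek to `offset` and return the record stored there."""
    stream = io.BytesIO(data)
    stream.seek(offset)
    return stream.readline().decode("ascii").rstrip("\r\n")


index = build_sorted_index(DATA)
print([name for name, _ in index])
print(read_record(DATA, dict(index)["IBM"]))
```

Python compares strings by code point, so this ordering is case-sensitive; a real index would need to fix its comparison rule and use the same rule when searching.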
int LinearSearch(TargetName)
{
◦ For i = 0 to Index.size()-1:
◦ {
If Index[i].CompanyName == TargetName
{
Return Index[i].offset
}
◦ }
◦ Return -1 // not found
}
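A runnable counterpart of the pseudocode above, as a minimal sketch in Python. It assumes the index is held in memory as a list of (company name, offset) pairs; the four sample entries are taken from the exercise, and `linear_search` is an illustrative name.

```python
def linear_search(index, target_name):
    """Scan the entries in order; return the offset, or -1 if not found."""
    for company_name, offset in index:
        if company_name == target_name:
            return offset
    return -1  # offsets are never negative, so -1 is a safe "not found" marker


# Linear search does not require the entries to be sorted.
index = [("Google", 26), ("ITE", 60), ("Microsoft", 90), ("IBM", 127)]

print(linear_search(index, "ITE"))
print(linear_search(index, "Oracle"))
```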
int BinarySearch(TargetName)
{ int low=0, high=Index.size()-1;
while (low<=high)
◦ { Midpoint=(high+low)/2;
◦ CurrentRecordIndex=Index[Midpoint];
◦ If (CurrentRecordIndex.CompanyName==TargetName)
Return CurrentRecordIndex.offset
◦ If (CurrentRecordIndex.CompanyName<TargetName)
Search in the right half: low= Midpoint+1
◦ Else
Search in the left half: high= Midpoint-1
◦ } //end while
◦ Return -1 //not found
}
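A runnable counterpart of the binary search, as a minimal sketch in Python. It assumes the index is a list of (company name, offset) pairs already sorted by name; `binary_search` is an illustrative name. In this version `high` starts at the last valid position and the midpoint is recomputed on every pass of the loop.

```python
def binary_search(index, target_name):
    """Return the offset for target_name, or -1; index must be sorted by name."""
    low, high = 0, len(index) - 1
    while low <= high:
        mid = (low + high) // 2  # recomputed for every narrowed range
        company_name, offset = index[mid]
        if company_name == target_name:
            return offset
        if company_name < target_name:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1  # not found


# Entries sorted by company name, as binary search requires.
index = [("Google", 26), ("IBM", 127), ("ITE", 60), ("Microsoft", 90)]

print(binary_search(index, "ITE"))
print(binary_search(index, "Oracle"))
```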
◦ Searching should be done on a collection loaded
into main memory.
◦ Binary search, O(log n), is faster than linear
search, O(n).
◦ The penalty of using binary search is that the
collection must be sorted.
◦ Is this a problem for an index?
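To make the O(n) versus O(log n) comparison concrete, the following Python sketch counts how many entries each search inspects in its worst case, a key that is not present. The 1024 synthetic keys and the helper names `linear_probes` and `binary_probes` are assumptions made for illustration only.

```python
def linear_probes(keys, target):
    """Count the entries a linear search inspects."""
    probes = 0
    for key in keys:
        probes += 1
        if key == target:
            break
    return probes


def binary_probes(keys, target):
    """Count the entries a binary search inspects; keys must be sorted."""
    probes = 0
    low, high = 0, len(keys) - 1
    while low <= high:
        probes += 1
        mid = (low + high) // 2
        if keys[mid] == target:
            break
        if keys[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return probes


n = 1024
keys = [f"company{i:04d}" for i in range(n)]  # zero-padded, so already sorted
missing = "zzz"  # sorts after every key and is absent: worst case for both

linear_count = linear_probes(keys, missing)
binary_count = binary_probes(keys, missing)
print(linear_count, binary_count)
```

A linear search for a missing key inspects all n entries, while binary search needs at most floor(log2 n) + 1 probes, which is 11 for n = 1024.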
Implement linear search on the IndexesCollection class.
Implement binary search on the IndexesCollection class.