首页 > 代码库 > 算法:支持重复元素的二分查找

算法:支持重复元素的二分查找

近几天在处理的一个项目,需要频繁对一些有序超大集合进行目标查找,二分查找算法是这类问题的最优解。但是java的Arrays.binarySearch()方法,如果集合中有重复元素,而且遇到目标元素正好是这些重复元素之一,该方法只能返回一个,并不能将所有的重复目标元素都返回,没办法,只能自造轮子了。

先复习下二分查找的经典算法:

 1     private int binarySearch1(Integer[] A, Integer x) { 2         int low = 0, high = A.length - 1; 3         while (low <= high) { 4             int mid = (low + high) / 2; 5             if (A[mid].equals(x)) { 6                 return mid; 7             } else if (x < A[mid]) { 8                 high = mid - 1; 9             } else {10                 low = mid + 1;11             }12         }13         return -1;14     }

思路很简单,先定位到中间元素,如果中间元素比目标元素大,则扔掉后一半,反之扔掉前一半,如果正好一次命中,直接返回。

略做改进:

 1     private List<Integer> binarySearch2(Integer[] A, Integer x) { 2         List<Integer> result = new ArrayList<Integer>(); 3         int low = 0, high = A.length - 1; 4         while (low <= high) { 5             int mid = (low + high) / 2; 6             if (A[mid].equals(x)) { 7                 if (mid > 0) { 8                     //看前一个元素是否=目标元素 9                     if (A[mid - 1].equals(x)) {10                         for (int i = mid - 1; i >= 0; i--) {11                             if (A[i].equals(x)) {12                                 result.add(i);13                             } else break;14                         }15                     }16                 }17                 result.add(x);18                 if (mid < high) {19                     //看后一个元素是否=目标元素20                     if (A[mid + 1].equals(x)) {21                         for (int i = mid + 1; i <= high; i++) {22                             if (A[i].equals(x)) {23                                 result.add(i);24                             } else break;25                         }26                     }27                 }28                 return result;29             } else if (x < A[mid]) {30                 high = mid - 1;31             } else {32                 low = mid + 1;33             }34         }35         return result;36 37     }

思路:命中目标后,看下前一个紧挨着的元素是否也是要找的元素,如果是,则顺着向前取,直到遇到不等于目标元素为止。然后再看后一个紧挨着的元素,做类似处理。

测试:

1         Integer[] A = new Integer[]{1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9};2 3         System.out.println("binarySearch1 => ");4         System.out.println(binarySearch1(A, 5));5 6         System.out.println("binarySearch2 => ");7         System.out.println(binarySearch2(A, 5));

binarySearch1 =>
5
binarySearch2 =>
[4, 5, 6]

从返回的下标值看,都在预期之中,但是事情并未到此止步,通常要查找的列表元素,并不是数值这么简单,一般是一些复杂的对象实例,为了做到通用,得弄成一个泛型版本:

 1    private <T> List<Integer> binarySearch4(List<T> A, T x, Comparator<? super T> comparator) { 2         List<Integer> result = new ArrayList<Integer>(); 3         int low = 0, high = A.size() - 1; 4         while (low <= high) { 5             int mid = (low + high) / 2; 6             int temp = comparator.compare(x, A.get(mid)); 7             if (temp == 0) { 8                 if (mid > 0) { 9                     if (comparator.compare(x, A.get(mid - 1)) == 0) {10                         for (int i = mid - 1; i >= 0; i--) {11                             if (comparator.compare(A.get(i), x) == 0) {12                                 result.add(i);13                             } else break;14                         }15                     }16                 }17                 result.add(mid);18                 if (mid < high) {19                     if (comparator.compare(x, A.get(mid + 1)) == 0) {20                         for (int i = mid + 1; i <= high; i++) {21                             if (comparator.compare(x, A.get(i)) == 0) {22                                 result.add(i);23                             } else break;24                         }25                     }26                 }27                 return result;28 29             } else if (temp < 0) {30                 high = mid - 1;31             } else {32                 low = mid + 1;33             }34         }35 36         return result;37     }

为了比较二个复杂对象实例的大小,引入了Comparator接口,可以根据业务需要,则调用者自定义比较规则。

测试一下:

先定义一个业务对象类AwbDto:

 1 package com.cnblogs.yjmyzz.test.domain; 2  3 /** 4  * Created by jimmy on 15/1/8. 5  */ 6 public class AwbDto { 7  8  9     /**10      * 运单号11      */12     private String awbNumber;13 14     /**15      * 始发站16      */17     private String originAirport;18 19     public AwbDto(String awbNumber, String originAirport) {20         this.awbNumber = awbNumber;21         this.originAirport = originAirport;22     }23 24     public String getAwbNumber() {25         return awbNumber;26     }27 28     public void setAwbNumber(String awbNumber) {29         this.awbNumber = awbNumber;30     }31 32     public String getOriginAirport() {33         return originAirport;34     }35 36     public void setOriginAirport(String originAirport) {37         this.originAirport = originAirport;38     }39 }

还需要定义AwbData比较大小的业务规则,假设:只要运单号相同,则认为相等(即:用运单号来区分对象大小)

1     private class AwbDtoComparator implements Comparator<AwbDto> {2 3         @Override4         public int compare(AwbDto x, AwbDto y) {5             return x.getAwbNumber().compareTo(y.getAwbNumber());6         }7     }

测试代码:

 1         List<AwbDto> awbList = new ArrayList<AwbDto>(); 2         awbList.add(new AwbDto("112-10010011", "北京")); 3         awbList.add(new AwbDto("112-10010022", "上海")); 4         awbList.add(new AwbDto("112-10010033", "天津")); 5         awbList.add(new AwbDto("112-10010044", "武汉")); 6         awbList.add(new AwbDto("112-10010044", "武汉")); 7         awbList.add(new AwbDto("112-10010055", "广州")); 8  9         AwbDtoComparator comparator = new AwbDtoComparator();10 11         AwbDto x = new AwbDto("112-10010044", "武汉");12 13 14         System.out.println("binarySearch4 => ");15         System.out.println(binarySearch4(awbList, x, comparator));

binarySearch4 =>
[3, 4]

测试结果,一切顺利,皆大欢喜,可以去休息了。

算法:支持重复元素的二分查找