LeetCode Problem 2: Median of Two Sorted Arrays

Problem

There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

Analysis

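The problem gives two sorted arrays of sizes m and n (either may be 0) and asks for the median of the two arrays combined. The running time must not exceed O(log(m+n)).
My first impression was that the problem is not hard; it just comes with a complexity requirement.
The most direct approach that came to mind is to combine the two arrays into one sorted array and then read off the median.
How should two sorted arrays be merged? (This topic deserves a closer look; I will check some references later.)
First allocate an array of size m+n, copy both arrays into it, sort it with the sort function, and then read off the median. The code is as follows: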
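#include <algorithm>
#include <cstring>
using namespace std;

class Solution {
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        // Copy both inputs into one buffer of size m+n.
        int *a=new int[m+n];
        
        memcpy(a,A,sizeof(int)*m);
        memcpy(a+m,B,sizeof(int)*n);
        
        // Sorting the combined buffer costs O((m+n) log(m+n)).
        sort(a,a+n+m);
        
        // Odd total: the middle element. Even total: mean of the two middle elements.
        double median=(double) ((n+m)%2? a[(n+m)>>1]:(a[(n+m-1)>>1]+a[(n+m)>>1])/2.0);
        
        // Memory obtained with new[] must be released with delete[].
        delete[] a;
        
        return median;
    }
};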

To my surprise, this was accepted. It felt like cheating: sorting costs O((m+n) log(m+n)), which does not meet the required O(log(m+n)), and it still passed. ORZ.
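
On the question of how to merge two sorted arrays: since both inputs are already sorted, one linear pass is enough, and the C++ standard library offers std::merge for exactly this. Below is a minimal sketch, assuming std::vector inputs, at least one non-empty array, and a hypothetical function name medianByMerge. It runs in O(m+n), which is better than sorting but still does not meet the O(log(m+n)) requirement.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (not from the original post).
// Assumes a and b are sorted ascending and at least one is non-empty.
double medianByMerge(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    // std::merge walks both sorted ranges once, so this is O(m + n).
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged));

    const std::size_t total = merged.size();
    if (total % 2 == 1) {
        return merged[total / 2];
    }
    // Convert to double before adding so two large ints cannot overflow.
    return (static_cast<double>(merged[total / 2 - 1]) + merged[total / 2]) / 2.0;
}
```

Unlike the sort-based version, this never reorders anything; it only interleaves the two inputs.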


The same approach is easy to write in Python, but LeetCode did not seem to accept Python built-ins: when I used the sorted function, the submission reported a compile error.


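Next is the second solution.
Notice that a full sort is more work than we need, because we only want the k-th smallest element. Keep a counter of how many elements have been consumed so far, and two pointers pA and pB that start at the first elements of A and B. As in the merge step of merge sort, if the current element of A is smaller, advance pA and increment the counter; if the current element of B is smaller, advance pB and increment the counter. When the counter reaches k we have the answer, in O(k) time and O(1) space.
Test code: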
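#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
class Solution
{
private:
	// Returns the k-th smallest element (1-based) of the two sorted arrays.
	double findKth(int a[], int m, int b[], int n, int k)
	{
		int i = 0;
		int j = 0;
		int index = 1;   // rank of the next element to be consumed
		int kth = 0;
		if (m == 0)
			return b[k-1];
		if (n == 0)
			return a[k-1];
		if (k ==1)
			return a[0]>b[0] ? b[0] : a[0];

		// Consume elements in ascending order until rank k is reached
		// or one of the arrays runs out.
		while(index <= k && i < m && j < n)
		{
			if( a[i] >= b[j])
			{
				kth = b[j];
				j ++;
			}
			else
			{
				kth = a[i];
				i ++;
			}
			index ++;
		}

		// One array ran out before rank k was consumed:
		// the answer lies in the remainder of the other array.
		if (index <= k && j == n)
			kth = a[i+k-index];
		else if (index <= k && i == m)
			kth = b[j+k-index];

		return kth;
	}
public:
	double findMedianSortedArrays(int A[], int m, int B[], int n)
	{
		int total = m + n;
		// total is even OR odd ?
		if (total & 0x1)           // odd
			return findKth(A, m, B, n, total / 2 + 1);
		else   // even
			return (findKth(A, m, B, n, total / 2)
					+ findKth(A, m, B, n, total / 2 + 1)) / 2;
	}
};

int main()
{
	Solution s1;
	int A[] = {6,7,8,9};
	int B[] = {5,6};
	cout << s1.findMedianSortedArrays(A, 4, B, 2) << endl;   // prints 6.5
	return 0;
}

However, on LeetCode I could not get the same results as in my local test. Two traps are the likely reason. Special-casing k == 2 as (a[0] + b[0]) / 2.0 is wrong, because the second-smallest element is not an average. And the checks after the loop must use index <= k rather than index < k; otherwise, when one array runs out exactly at index == k, a stale value is returned.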
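
The counting idea from the second solution can also be written so that an exhausted array is handled inside the loop, which removes the separate tail checks. Below is a minimal sketch, assuming std::vector inputs and a hypothetical helper name kthByTwoPointers; it is an illustration of the idea, not the code I submitted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post).
// Returns the k-th smallest element (1-based) of two ascending arrays.
// Assumes 1 <= k <= a.size() + b.size().
int kthByTwoPointers(const std::vector<int>& a, const std::vector<int>& b,
                     std::size_t k) {
    std::size_t i = 0;  // next unread position in a
    std::size_t j = 0;  // next unread position in b
    int current = 0;
    for (std::size_t taken = 0; taken < k; ++taken) {
        // Take from a when b is exhausted, or when a's head is not larger.
        if (j == b.size() || (i < a.size() && a[i] <= b[j])) {
            current = a[i++];
        } else {
            current = b[j++];
        }
    }
    return current;  // the k-th element consumed in ascending order
}
```

For an even total, the median is the mean of the results for k = total / 2 and k = total / 2 + 1. The cost is still O(k) time and O(1) extra space.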
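Then there is the clever approach:
Start from k. If each step discards one element that is certain to come before the k-th smallest element, we need k steps. But what if each step discards half of them? With this binary-search-like idea we can reason as follows (the explanation below comes from someone else):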
Assume that A and B each contain more than k/2 elements. If we compare the k/2-th smallest element of A (i.e. A[k/2-1]) with the k/2-th smallest element of B (i.e. B[k/2-1]), there are three possible outcomes:
(k can be odd or even; we assume k is even here for simplicity. The argument also holds when k is odd.)
A[k/2-1] = B[k/2-1]
A[k/2-1] > B[k/2-1]
A[k/2-1] < B[k/2-1]
If A[k/2-1] < B[k/2-1], then all the elements from A[0] to A[k/2-1] (i.e. the k/2 smallest elements of A) lie within the k smallest elements of the union of A and B. In other words, A[k/2-1] can never be larger than the k-th smallest element of the union of A and B.

Why?
We can prove it by contradiction. Suppose A[k/2-1] is larger than the k-th smallest element of the union of A and B; then at least k elements are smaller than A[k/2-1]. But A holds at most (k/2-1) elements smaller than A[k/2-1], and because A[k/2-1] < B[k/2-1], B also holds at most (k/2-1) such elements. The total is at most k/2 + k/2 - 2, which is smaller than k whether k is odd or even. This is a contradiction, so A[k/2-1] can never be larger than the k-th smallest element of the union of A and B when A[k/2-1] < B[k/2-1].
Given this conclusion, we can safely drop the first k/2 elements of A, which are guaranteed to lie among the k smallest elements of the union, and then look for the (k - k/2)-th smallest element of what remains. The same holds when A[k/2-1] > B[k/2-1], in which case we drop the elements of B.
When A[k/2-1] = B[k/2-1], we have found the k-th smallest element: it is that common value, call it x. There are (k/2-1) elements before x in each of A and B, so x must be the k-th smallest number. So we can call a function recursively: when A[k/2-1] < B[k/2-1] we drop the elements of A, otherwise we drop the elements of B.


We should also consider the edge cases, that is, when should we stop?
1. When A or B is empty, we return B[k-1] (or A[k-1], respectively);
2. When k is 1 (and A and B are both non-empty), we return the smaller of A[0] and B[0];
3. When A[k/2-1] = B[k/2-1], we return either of them.

In the code, we first check whether m is larger than n and swap the arguments if so, to guarantee that the first array is always the smaller one; this keeps the code simple.
#include <algorithm>
using namespace std;

double findKth(int a[], int m, int b[], int n, int k)
{
	//always assume that m is equal or smaller than n
	if (m > n)
		return findKth(b, n, a, m, k);
	if (m == 0)
		return b[k - 1];
	if (k == 1)
		return min(a[0], b[0]);
	//divide k into two parts
	int pa = min(k / 2, m), pb = k - pa;
	if (a[pa - 1] < b[pb - 1])
		return findKth(a + pa, m - pa, b, n, k - pa);
	else if (a[pa - 1] > b[pb - 1])
		return findKth(a, m, b + pb, n - pb, k - pb);
	else
		return a[pa - 1];
}

class Solution
{
public:
	double findMedianSortedArrays(int A[], int m, int B[], int n)
	{
		int total = m + n;
		if (total & 0x1)
			return findKth(A, m, B, n, total / 2 + 1);
		else
			return (findKth(A, m, B, n, total / 2)
					+ findKth(A, m, B, n, total / 2 + 1)) / 2;
	}
};
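
The same elimination idea can also be written without recursion by moving two start offsets instead of passing shifted pointers. Below is a minimal sketch of that variant, assuming std::vector inputs and hypothetical names kthByElimination and medianByElimination. Note one deliberate difference from the listing above: here both arrays are probed at k/2 (clamped to the array end), rather than splitting k into pa and pb = k - pa.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post).
// k-th smallest (1-based), discarding up to k/2 elements per round.
// Assumes both inputs ascending and 1 <= k <= a.size() + b.size().
int kthByElimination(const std::vector<int>& a, const std::vector<int>& b,
                     std::size_t k) {
    std::size_t i = 0;  // start of the part of a still under consideration
    std::size_t j = 0;  // start of the part of b still under consideration
    while (true) {
        if (i == a.size()) return b[j + k - 1];
        if (j == b.size()) return a[i + k - 1];
        if (k == 1) return std::min(a[i], b[j]);

        const std::size_t half = k / 2;
        // Probe positions, clamped so they stay inside each array.
        const std::size_t pa = std::min(i + half, a.size()) - 1;
        const std::size_t pb = std::min(j + half, b.size()) - 1;
        if (a[pa] <= b[pb]) {
            // a[i..pa] all lie before the k-th smallest: discard them.
            k -= pa - i + 1;
            i = pa + 1;
        } else {
            // b[j..pb] all lie before the k-th smallest: discard them.
            k -= pb - j + 1;
            j = pb + 1;
        }
    }
}

// Assumes at least one input is non-empty.
double medianByElimination(const std::vector<int>& a, const std::vector<int>& b) {
    const std::size_t total = a.size() + b.size();
    if (total % 2 == 1) {
        return kthByElimination(a, b, total / 2 + 1);
    }
    return (static_cast<double>(kthByElimination(a, b, total / 2)) +
            kthByElimination(a, b, total / 2 + 1)) / 2.0;
}
```

Each round removes about k/2 elements, so the loop runs O(log k) times, which is within the O(log(m+n)) bound the problem asks for.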

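Another approach:
First turn the problem into finding the k-th smallest number of A and B, then probe position k/2 in each of A and B. For example, with k = 6, look at the 3rd number of A and of B. We know A1 < A2 < A3 < A4 < A5... and B1 < B2 < B3 < B4 < B5.... If A3 <= B3, the 6th smallest number cannot be A1, A2, or A3, because at most two numbers are smaller than A1, three are smaller than A2, and four are smaller than A3. B3 is at least as large as 5 numbers, so the 6th smallest could be B1 (A1 < A2 < A3 < A4 < A5 < B1), B2 (A1 < A2 < A3 < B1 < A4 < B2), or B3 (A1 < A2 < A3 < B1 < B2 < B3). So we can discard A1, A2, A3 and reduce the problem to finding the 3rd smallest number among A4, A5, ... B1, B2, B3, ...; k has been halved. Each call assumes A has fewer elements, so pa = min(k/2, lenA); the recursion eventually reaches k == 1 or an empty A, and both are termination conditions.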

class Solution:
    # @return a float
     
    def getMedian(self, A, B, k):
        # return kth smallest number of arrays A and B, assume len(A) <= len(B)
        lenA = len(A); lenB = len(B)
        if lenA > lenB: return self.getMedian(B, A, k)
        if lenA == 0: return B[k-1]
        if k == 1: return min(A[0], B[0])
        # Use // so the split stays an integer on Python 3 as well as Python 2.
        pa = min(k // 2, lenA); pb = k - pa
        return self.getMedian(A[pa:], B, k - pa) if A[pa - 1] <= B[pb - 1] else self.getMedian(A, B[pb:], k - pb)
     
    def findMedianSortedArrays(self, A, B):
        lenA = len(A); lenB = len(B)
        if (lenA + lenB) % 2 == 1: 
            return self.getMedian(A, B, (lenA + lenB) // 2 + 1)
        else:
            return 0.5 * ( self.getMedian(A, B, (lenA + lenB) // 2) + self.getMedian(A, B, (lenA + lenB) // 2 + 1) )

This problem is not that simple after all.