The main procedure:

The Trie data structure:

And the heap of fixed size:

Fizz Buzz

August 22, 2015 happyduduAlogrithmLeave a comment

初阶：给一个区间[a,b]，对于区间中每个数，如果是3的倍数，输出Fizz，如果是5的倍数，输出Buzz，如何能同时是3和5的倍数，输出FizzBuzz。
A：for区间中的每个数，先判断能否被15整除，然后判断被3整除，然后判断被5整除。注意跳过0。

进阶：假如有很多个除数和对应的单词被放在一个Hash表中（上述例子为{3: “Fizz”, 5:”Buzz”}）。请问有什么方法可以加快运算效率？

进阶：时间最优的方法-开一个区间那么大的数组，数组的每个元素是一个字符串，对于每个除数，枚举他在区间中的所有倍数，然后将单词加入数组中对应位置。时间复杂度O(n)，n为答案的个数。空间复杂度O(b-a)
空间最优的方法-上述方法需要开辟一个大数组，在区间很大，而除数不多的情况下并不划算。对于每个除数，计算出每个除数的第一个大于等于左区间的倍数，然后放入一个堆。每次从堆中取出一个数，讲结果记录，并讲该结果加上对应除数，放回堆。时间复杂度O(nlogm)，m为除数个数。空间复杂度O(m)。
Preconditions: n is a non-negative integer, k is a positive integer, and each divisor supplied is a positive integer.

Create a heap priority queue containing of (multiple, divisor, string) tuples, ordered by multiple (and subsequently divisor in case of a tie). Initialize it by using (d, d, s) for each (d, s) input divisor/message pair.

For each iteration with iteration number i, examine the front of the queue (m, d, s). If i < m then just print i and move onto the next iteration. Otherwise i = m (by construction). Print s, pop front of queue, and add insert (m + d, d, s). Restart the current i iteration.

扔棋子

August 22, 2015 happyduduAlogrithmLeave a comment

初阶：有一个100层高的大厦，你手中有两个相同的玻璃围棋子。从这个大厦的某一层及更高的层扔下围棋子就会碎，用你手中的这两个玻璃围棋子，找出一个最优的策略（扔最少的次数），来得知那个临界层面。

进阶：如果大厦高度是N层，你有K个棋子，请问最少需要扔几次可以知道得临界层？
答：

初阶：推导过程如下：首先在第x层仍一个玻璃围棋子，如果碎了，则利用另一个玻璃围棋子依次从1~x-1层试探，查找临界层；如果第x层未碎，则在第x+(x-1)=2x-1层仍一个玻璃围棋子，如果碎了，则利用另一个玻璃围棋子一次从x+1~2x-2层查找临界层，…,这样，可得到表达式：x+x-1+…+1>=100，可得到x=14，即：
先从14层扔（碎了试1-13）
再从27层扔（碎了试15-26）
再从39层扔（碎了试28-38）
再从50层扔（碎了试40-49）
再从60层扔（碎了试51-59）
再从69层扔（碎了试61-68）
再从77层扔（碎了试70-76）
再从84层扔（碎了试78-83）
再从90层扔（碎了试85-89)
再从95层扔（碎了试91-94）
再从99层扔（碎了试96-98）
最后从100层扔（根据题意一定会碎，也可以不扔了）
最坏情况下扔14次。
f(x)=max(x, f(N-x)+1)

再谈大楼扔鸡蛋的问题

进阶：
动态规划。设F[N][K]为N层楼，K个棋子最少仍几次。有状态转移方程如下：
楼层有n个，球有K 个
F（n,k） = min{max{f(r,k-1),f(r+1,k)}+1, 1=<r<=n}

这个的意思就是F（n,k）的含义 n层楼 k个球最少次数确定临界层

其实 r就是用来表示我从哪一层开始探测，当然需要考虑从第一层到第n层中的某一个来开始处理。

情况假设从第r层开始扔

如果球碎了，那么一定在1 – r层必有一层为临界剩k-1个球

如果球未碎，那么一定在r+1-n层必有一层为临界剩 k个球
当然在计算F（n,k）是我需要按最坏情况考虑，如果在 1-r-1层扔的次数多于 r-n层我需要记录的次数当然为f(r,k-1)反之亦然。

时间复杂度O(N^2 * K)
http://www.haodaima.net/art/2817797

Majority Element I && II

August 20, 2015 happyduduLeave a comment

Majority Element I — Moore Voting

   public int majorityElement(int[] nums) {
        if (nums.length==0) return 0;
        if (nums.length<=2) return nums[0];
        int maj_idx=0;
        int count=0;
        for(int i=0;i<nums.length;i++){
            if (count==0){
                maj_idx=i;
            }
            if (nums[maj_idx]==nums[i]){
                count++;
                if (count>nums.length/2) return nums[maj_idx];
            }else{
                count--;
            }
            
        }
        return nums[maj_idx];
    }

Majority Element II

 public List<Integer> majorityElement(int[] nums) {
        List<Integer> res=new ArrayList<Integer>();
        if(nums.length==0) return res;
        int c1=0; int c2=0; int count1=0;
        int count2=0;
        for(int i=0;i<nums.length;i++){
            if (count1==0){
                c1=i; 
            }else if (count2==0){
                c2=i;  
            }
            int cur=nums[i];
            if (cur==nums[c1]){
                count1++;
            }else if (cur==nums[c2]){
                count2++;
                
            }else{
                count1--; count2--;
            }
            
        }
        count1=0; count2=0;
        for(int i=0;i<nums.length;i++){
            int cur=nums[i];
            if (cur==nums[c1]){
                count1++;
                 
            }else if (cur==nums[c2]){
                count2++; 
               
            } 
            
        }
        if (count1>nums.length/3) res.add(nums[c1]);
         if (count2>nums.length/3) res.add(nums[c2]);
         return res;
    }

find kth number – 第k大的数

August 20, 2015August 20, 2015 happyduduLeave a comment

问题等价于求两个array的第k=(m+n)/2（假设m和n分别是两个数组的元素个数）大的数是多少。基本思路是每次通过查看两个数组的第k/2大的数(假设是A[k/2],B[k/2])，如果两个A[k/2]=B[k/2]，说明当前这个数即为两个数组剩余元素的第k大的数，如果A[k/2]>B[k/2], 那么说明B的前k/2个元素都不是我们要的第k大的数，反之则排除A的前k/2个，如此每次可以排除k/2个元素，最终k=1时即为结果。总的时间复杂度为O(logk),空间复杂度也是O(logk)，即为递归栈大小。

 
private int helper(int A[], int B[], int i, int i2, int j, int j2, int k)
{
 int m = i2-i+1;
 int n = j2-j+1;
 if(m>n)
    return helper(B,A,j,j2,i,i2,k);
 if(m==0)
     return B[j+k-1];
 if(k==1)
    return Math.min(A[i],B[j]);
 int posA = Math.min(k/2,m);
 int posB = k-posA;
 if(A[i+posA-1]==B[j+posB-1])
     return A[i+posA-1];
 else if(A[i+posA-1]<B[j+posB-1])
     return helper(A,B,i+posA,i2,j,j+posB-1,k-posA);
 else
     return helper(A,B,i,i+posA-1,j+posB,j2,k-posB);
}

Maven — Summary

August 12, 2015August 12, 2015 happyduduMaven, ToolLeave a comment

Maven Phases:

validate -> compile -> test-> package->integration-test -> verify -> install->deploy

Mavenrepository.com to find artifact for dependency and add to pom.xml

<groupId>junit</groupId>

<artifactId>junit</artifactId>

<scope>test</scope> — default is compile

</dependency>

Deadlock & deadlock prevention

August 7, 2015August 7, 2015 happyduduCOncurrency, javaLeave a comment

Deadlock: A deadlock is when two or more threads are blocked waiting to obtain locks that some of the other threads in the deadlock are holding.

Deadlock prevention:

Lock Ordering: make sure all locks are always taken in the same order by any thread.
Lock timeout: put a timeout on lock attempts meaning that a thread trying to obtain a lock will only try for so long before giving up.
Deadlock Detection: a heavier deadlock prevention mechanism aimed at cases in which lock ordering isn’t possible, and lock timeout isn’t feasible.
- One possible action is to release all locks,backup, wait a random amount of time and then retry.
- one option is to determine or assign a priority of the threads so that only one(or a few) thread backs up.

Structured data vs unstructured data

August 3, 2015August 4, 2015 happyduduLeave a comment

Structured data (as explained succinctly in Big Data Republic’s video) is information, usually text files, displayed in titled columns and rows which can easily be ordered and processed by data mining tools.

Unstructured data: anything else

Q: When should I use Amazon DynamoDB vs Amazon S3?

Amazon DynamoDB stores structured data, indexed by primary key, and allows low latency read and write access to items ranging from 1 byte up to 64KB. Amazon S3 stores unstructured blobs and suited for storing large objects up to 5 TB. In order to optimize your costs across AWS services, large objects or infrequently accessed data sets should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon DynamoDB.

^{— http://aws.amazon.com/dynamodb/faqs/#When_should_I_use_Amazon_DynamoDB_vs_Amazon_S3}