1. 設(shè)計文件系統(tǒng)
2. 數(shù)據(jù)結(jié)構(gòu)for spreadsheet
3. 一個app需要用cache,怎么實現(xiàn)thread safe
4. social network, billions id, every id has about 100 friends roughly, what is
max connections between any two ppls. write algorithm to return min
connections between two ids: int min_connection(id1, id2)
you can call following functions
expand(id) return friends list of id
expandall(list) return friends union of all the ids in the list
intersection(list1, list2) return intersection
removeintersection(list1, list2)
5. Open google.com, you type some words in the edit box to search something, it will return lots of search results. Among these returning results (that is, website link), you may only CLICK some results that are interesting to you. The system will record the “CLICK”action. Finally, you will have the search results (i.e. url) and “CLICK” informatin at hand.
Question: how do you find the similarity of these searching?
6. 如何找出最熱門的話題(根據(jù)tweets)。如果一個話題一直熱門,我們不想考慮怎么辦
7. Discuss design challenges of a distributed web crawler running on commercial PCs. How to utilize network bandwidth of those PCs efficiently?
8. Design a site similar to tinyurl.com
9. large log file,含有 customer id, product id, time stamp想得到在某一天中某個custom看網(wǎng)頁的次數(shù)1. 足夠memory 2. limited memory
10. 設(shè)計一個actor和movie的數(shù)據(jù)庫的schema, 支持從movie得到它的actors和從actor得到ta出現(xiàn)過的moive (Google, phone, 2006)
11. 某建筑有五十層高,打算裝倆電梯,設(shè)計該電梯系統(tǒng)
12. how to design facebook’s newsfeed?
13. 一個文件里n行m列,每行是一個record,每列一個feature,你時不時要按不同feature排序和查找。不能用數(shù)據(jù)庫,文件大小內(nèi)存能裝下,數(shù)據(jù)結(jié)構(gòu)和算法不限,語言不限,給出你最好的辦法。
14. Design online game
15. static 變量用來在整個class中共享數(shù)據(jù).基于此,各種synchronization技術(shù), 以及busy waiting的優(yōu)缺點,啥時候要用基于busy waiting的 spinlock主要是基于性能的探討。 如果有一個應(yīng)用程序運行時沒有達到timing constraint, 你如何去分析問題出在哪兒, 可以用什么工具或者技術(shù)。
16. 設(shè)計題, 有一個多臺機器構(gòu)成的cluster。 現(xiàn)在有大量公司的數(shù)據(jù)文件 (并有多個備份)。 如果設(shè)計一個算法,使得每臺機器盡量均衡的使用,并且 每個公司文件的不同copy不能存在于同一臺機器上。主要的Idea就是用round-robin的方式assign每個公司的原數(shù)據(jù)文件到一臺機器,再結(jié)合使用hashtable。 Interviewer提到我的解法正是他現(xiàn)在在使用的解法。
17. Design a class providing lock function which provide lock only if it sees there are no possible deadlocks.
18. 設(shè)計一個分布式文件系統(tǒng),給定path name,可以讀寫文件。具體的system design這里就不提了。其中一個細節(jié)是,給定path name,怎么知道哪個node擁有這個文件。我提出需要實現(xiàn)一個lookup function,它可以是一個hash function,也可以是一個lookup table。如果是lookup table,為了讓所有client sync,可以考慮額外做一個lookup cluster。然后Interviewer很糾結(jié),既然可以用hash function,為什么還搞得那么復(fù)雜。我就告訴他hash function的缺點。假定一開始有N個node,hash function把M個文件uniformly distribute到N個node上。某天發(fā)現(xiàn)capacity不夠,加了一個node。首先,要通知所有的client machine,configuration 改變了。如果不想重啟client machine的process,這不是一個trivial job。其次,文件到node的mapping也變了。比如,本來按照hash function,一個文件是放在node 1。加了一個node 后,它可能就map到node 2了。平均來說,N/(N+1)的文件需要move到新的node。這個data migration還是很大的。然后我就提出一些hash function的design,可以減少data migration。最后他提了一個問題,說要實現(xiàn)一個function,要統(tǒng)計distributed file system所有目錄的大小。前提是,一個目錄下的文件可能放在不同的node上。我說這個不就是在每個node上統(tǒng)計,然后發(fā)到一個merge嗎。他說對,但是又問用什么data structure來表示。我說這就是hash table,key就是directory name,value就是大小。因為directory本身是樹結(jié)構(gòu),這個hash table的key可以用tree來組織。最后讓我實現(xiàn)一個function,把我說得這個data structure serialize成byte array。因為這個byte array就是網(wǎng)絡(luò)傳輸?shù)膁ata。我用了depth first traverse。不過等我程序?qū)懲辏虐l(fā)現(xiàn),用breath first traverse會更方便,code也會很簡潔
19. 超大圖的存儲問題
20. 給個Lock w/ two atomic method lock() and unlock(),請用lock實現(xiàn)一個文件讀寫的系統(tǒng),要求:
1: reader blocks writer;
2: writer blocks reader;
3: writer blocks writer;
21。設(shè)計一個web cache server,假設(shè)存儲網(wǎng)頁數(shù)量是10個billion,打算怎么設(shè)計
22.你可以得到網(wǎng)站訪問記錄,每條記錄有user IP, 寫一個程序,要隨時能算出過去5分鐘內(nèi)訪問次數(shù)最多的1000個IP. 這個好像跟著這個rolling window 的precision 有關(guān),所以我們暫且定為5秒鐘一次window
23. Design free and malloc.
24. how to design data structures for a facebook network and how to design an algorithm to find connection? How to optimize it if data is distributed into multiple computers?
25. design a deck class and member function to randomly select a card from those cards which haven’t been selected before. (You can assume the number of this function call will never be larger than the number of cards) For example, we have a deck of four card: 1,2,3,4. First it may select 3, then next time it should randomly select one from 1,2,4… And design a member function to reset.
26. google search design problem. How to distribute data and how to design backup system
27. 設(shè)計一個online chat system
28. design bit.ly url shortening web service。算法設(shè)計,后端存儲,中間層cache,前端load balance,最后是web analytics。
29. Design and implement an algorithm that would correct typos: for example, if an extra letter is added, what would you do?
30. Suppose there are 2 persons A and B on FB . A should be able to view the pictures of B only if either A is friend of B or A and B have at least one common friend . The interviewer discussed it for nearly 30 minutes . The discussion mainly included following points:
1. How are you going to store the list of friends for a given user?
2. File system vs DB
3. Given list of friends of 2 users, how are you going to find common friends?
4. If you are going to store the friends in DB then how will the table look like?
5. How many servers do you need?
6. How are you going to allocate work to servers?
7. How many copies of data will you need?
8. What problems will you face if you are maintaining multiple copies of data.
31. design structure for auto completion
32. 如何實現(xiàn)search suggestions。
33. 設(shè)計fb的系統(tǒng)支持like那個button
34. design 股票#,time,price;
-設(shè)計一個client side顯示股票信息,給出盡可能多的user case
-在給出的user case里面,怎么設(shè)計客戶端,使得客戶段性能提高
-怎么設(shè)計server端
-數(shù)據(jù)如何傳輸
-server端如何保存數(shù)據(jù)
-怎么設(shè)計database table保存數(shù)據(jù)
-不用index怎么提高數(shù)據(jù)查詢速度
-database是怎么實現(xiàn)數(shù)據(jù)查詢的(要求從database implementation角度解釋)
[架構(gòu)師面試題]