
Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.
History is littered with hundreds of conflicts over the future of a community, group, location or business that were "resolved" when one of the parties stepped ahead and destroyed what was there. With the original point of contention destroyed, the debates would fall to the wayside. Archive Team believes that by duplicated condemned data, the conversation and debate can continue, as well as the richness and insight gained by keeping the materials. Our projects have ranged in size from a single volunteer downloading the data to a small-but-critical site, to over 100 volunteers stepping forward to acquire terabytes of user-created data to save for future generations.
The main site for Archive Team is at archiveteam.org and contains up to the date information on various projects, manifestos, plans and walkthroughs.
This collection contains the output of many Archive Team projects, both ongoing and completed. Thanks to the generous providing of disk space by the Internet Archive, multi-terabyte datasets can be made available, as well as in use by the Wayback Machine, providing a path back to lost websites and work.
Our collection has grown to the point of having sub-collections for the type of data we acquire. If you are seeking to browse the contents of these collections, the Wayback Machine is the best first stop. Otherwise, you are free to dig into the stacks to see what you may find.
The Archive Team Panic Downloads are full pulldowns of currently extant websites, meant to serve as emergency backups for needed sites that are in danger of closing, or which will be missed dearly if suddenly lost due to hard drive crashes or server failures.
Given an array
w
of positive integers, wherew[i]
describes the weight of indexi
, write a functionpickIndex
which randomly picks an index in proportion to its weight.Note:
1 <= w.length <= 10000
1 <= w[i] <= 10^5
pickIndex
will be called at most10000
times.Example 1:
Example 2:
Explanation of Input Syntax:
The input is two lists: the subroutines called and their arguments.
Solution
's constructor has one argument, the arrayw
.pickIndex
has no arguments. Arguments are always wrapped with a list, even if there aren't any.这道题给了一个权重数组,让我们根据权重来随机取点,现在的点就不是随机等概率的选取了,而是要根据权重的不同来区别选取。比如题目中例子2,权重为 [1, 3],表示有两个点,权重分别为1和3,那么就是说一个点的出现概率是四分之一,另一个出现的概率是四分之三。由于我们的rand()函数是等概率的随机,那么我们如何才能有权重的随机呢,我们可以使用一个trick,由于权重是1和3,相加为4,那么我们现在假设有4个点,然后随机等概率取一个点,随机到第一个点后就表示原来的第一个点,随机到后三个点就表示原来的第二个点,这样就可以保证有权重的随机啦。那么我们就可以建立权重数组的累加和数组,比如若权重数组为 [1, 3, 2] 的话,那么累加和数组为 [1, 4, 6],整个的权重和为6,我们 rand() % 6,可以随机出范围 [0, 5] 内的数,随机到 0 则为第一个点,随机到 1,2,3 则为第二个点,随机到 4,5 则为第三个点,所以我们随机出一个数字x后,然后再累加和数组中查找第一个大于随机数x的数字,使用二分查找法可以找到第一个大于随机数x的数字的坐标,即为所求,参见代码如下:
解法一:
我们也可以把二分查找法换为STL内置的upper_bound函数,根据上面的分析,我们随机出一个数字x后,然后再累加和数组中查找第一个大于随机数x的数字,使用二分查找法可以找到第一个大于随机数x的数字的坐标。而upper_bound() 函数刚好就是查找第一个大于目标值的数,lower_bound() 函数是找到第一个不小于目标值的数,用在这里就不太合适了,关于二分搜索发的分类可以参见博主之前的博客LeetCode Binary Search Summary 二分搜索法小结,参见代码如下:
解法二:
讨论:这题有个很好的follow up,就是说当weights数组经常变换怎么办,那么这就有涉及到求变化的区域和问题了,在博主之前的一道 Range Sum Query - Mutable 中有各种方法详细的讲解。只要把两道题的解法的精髓融合到一起就行了,可以参见论坛上的这个帖子。
类似题目:
Random Pick Index
Random Pick with Blacklist
Random Point in Non-overlapping Rectangles
参考资料:
https://leetcode.com/problems/random-pick-with-weight/discuss/154024/JAVA-8-lines-TreeMap
https://leetcode.com/problems/random-pick-with-weight/discuss/154772/C%2B%2B-concise-binary-search-solution
https://leetcode.com/problems/random-pick-with-weight/discuss/154044/Java-accumulated-freq-sum-and-binary-search
LeetCode All in One 题目讲解汇总(持续更新中...)
The text was updated successfully, but these errors were encountered: