UIC Proxy Bookmarklet
If you often do research from off campus, you can install the proxy bookmarklet on your browser’s bookmarks toolbar. With one click, the bookmarklet reopens the page you are viewing (for example, an e-journal archive) through the UIC proxy, so you do not have to retype the address in the URL bar.
Installation Instructions
Internet Explorer:
* Right-click this link: “UIC Proxy”
* Select “Add to Favorites”
—OR—
* Click and drag the link to your “Links” toolbar.
You might see a security alert saying that the favorite “may not be safe”. It is safe; click “Yes” to continue.
Mozilla/Firefox:
* Right-click this link: “UIC Proxy”
* Select “Bookmark This Link”
—OR—
* Click and drag the link to your “Bookmarks” toolbar
Safari or Google Chrome:
* Enable the bookmarks bar
* Click and drag the “UIC Proxy” link to your “Bookmarks” bar
For other web browsers, you can create a new bookmark and paste the following code into its URL (location) field:
javascript:(function%20getHost(){%20var%20url=window.location.href;%20var%20nohttp=url.split("//")[1];%20var%20spp=nohttp.split("/");%20spp[0]=spp[0]+".proxy.cc.uic.edu";%20var%20newurl="http://"+spp.join("/");%20open(newurl);%20})();
Tested in IE 7 and 8, Firefox 3.6, Safari 5, and Chrome 8.0 under Windows XP and Ubuntu.
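For anyone curious about what the one-liner does, here is a readable sketch of the same URL rewrite. The helper name `toProxyUrl` and the example address are made up for illustration; the bookmarklet itself reads `window.location.href` and opens the result in a new window.

```javascript
// Readable sketch of the bookmarklet's rewrite step.
// toProxyUrl is a hypothetical helper name; it is not part of the bookmarklet.
function toProxyUrl(href) {
  // Drop the scheme: "http://host/path" -> "host/path".
  // Like the bookmarklet, this keeps only the text between the first
  // and second "//", so a later "//" in the address would cut it short.
  var noScheme = href.split("//")[1];
  var parts = noScheme.split("/");
  // Append the proxy suffix to the host name only.
  parts[0] = parts[0] + ".proxy.cc.uic.edu";
  // The bookmarklet always rebuilds the address with http://.
  return "http://" + parts.join("/");
}

// Example with a placeholder address:
console.log(toProxyUrl("http://www.example.org/journal/article?id=1"));
```

Note that an https:// page is also sent to an http:// proxy address, because the scheme is discarded and rebuilt.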
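Finishing Harry Potter in One Evening... Sadly, It Failed
I was in no mood to do homework tonight (we even have an extra class tomorrow, on a Saturday; our teacher is out of control, and since nobody is coming over for dinner this weekend, I have lost the motivation to cook). Digging through the files on my computer, I found the complete Harry Potter series. I remember downloading the e-books because a certain classmate was a big fan, but I really cannot work up any interest in reading children's books. What to do... So that I can join in the next time the kids talk about Harry, I decided to spend one evening getting through Harry Potter.
Of course, as someone with a bachelor's in mathematics, a future master's in statistics, and potentially a PhD in statistics, I am not going to read it word by word. To get the gist of little Harry as fast as possible, I wrote a few lines of code to find which words and short phrases appear most often in the book, and to work out its main theme from those.
I will use the first book, Harry Potter and the Sorcerer’s Stone, as the example.
First, the word count: 78,546 words in total, far fewer than I expected. In Chinese, 100,000 characters is not much text at all, yet this thick book comes in under 80,000 words. It seems English-language authors have a hard time earning their per-word fees.
Next, which words appear most often? First place goes to... “the” -_-||, with 3,627 occurrences. Here is the top 20: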
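| Word | Count |
|---|---|
| the | 3627 |
| and | 1919 |
| to | 1856 |
| a | 1688 |
| he | 1528 |
| of | 1259 |
| harry | 1214 |
| was | 1186 |
| it | 1022 |
| in | 965 |
| his | 937 |
| you | 862 |
| said | 793 |
| had | 702 |
| i | 650 |
| on | 635 |
| at | 625 |
| that | 601 |
| they | 597 |
| as | 525 |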
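The post does not include its code, so here is a minimal sketch of how a word-frequency table like this one could be produced. The tokenizer (lowercase, letters and internal apostrophes only) and the function names are my assumptions, not the author's method, and the sample sentence is invented.

```javascript
// Illustrative word counter; the names and the tokenization rule are assumptions.
function tokenize(text) {
  // Lowercase, then keep runs of letters (allowing internal apostrophes).
  return text.toLowerCase().match(/[a-z]+(?:'[a-z]+)*/g) || [];
}

function countWords(tokens) {
  var counts = new Map();
  tokens.forEach(function (word) {
    counts.set(word, (counts.get(word) || 0) + 1);
  });
  return counts;
}

function topN(counts, n) {
  // Sort by count (descending); break ties alphabetically for stable output.
  return Array.from(counts.entries())
    .sort(function (a, b) {
      return b[1] - a[1] || (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0);
    })
    .slice(0, n);
}

// Invented sample text, standing in for the book:
var sampleTokens = tokenize("The owl saw the cat, and the cat saw Harry.");
console.log(sampleTokens.length);                 // total number of words
console.log(topN(countWords(sampleTokens), 3));   // most frequent words
```

For the real book, the text would be read from a file and passed through the same three functions with `n` set to 20.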
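I looked through the top 100 words, and they are all words an elementary school student would know. Of course, English may simply be like that, and it does not prove that an elementary schooler could read Harry Potter; I would need to run the same comparison on other books to say anything meaningful. Still, the book contains only 5,658 distinct words, which seems low. As I recall, the CET-6 vocabulary is about 8,000 words; how big is the GRE's? Moreover, only 983 words appear 10 or more times, and together they account for 63,428 occurrences. Skimming them, they are words like young, ugl, and main. In other words, with a vocabulary of just over 1,000 words you can read about 80% of the text (63,428 / 78,546 ≈ 81%). Easy, right?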
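The coverage figure can be checked with a few lines. The sketch below assumes a simple lowercase tokenizer and a hypothetical `coverage` helper; only the two totals at the end (63,428 and 78,546) come from the post.

```javascript
// Illustrative coverage calculation; tokenize and coverage are assumed names.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+(?:'[a-z]+)*/g) || [];
}

// Counts distinct words, the words occurring at least minCount times,
// and the share of all tokens those frequent words account for.
function coverage(tokens, minCount) {
  var counts = new Map();
  tokens.forEach(function (word) {
    counts.set(word, (counts.get(word) || 0) + 1);
  });
  var frequentWords = 0;
  var coveredTokens = 0;
  counts.forEach(function (count) {
    if (count >= minCount) {
      frequentWords += 1;
      coveredTokens += count;
    }
  });
  return {
    vocabulary: counts.size,
    frequentWords: frequentWords,
    coveredTokens: coveredTokens,
    share: tokens.length ? coveredTokens / tokens.length : 0
  };
}

// Toy input: "a" and "b" occur at least twice and cover 5 of 7 tokens.
console.log(coverage(tokenize("a b a c a b d"), 2));

// The post's own totals: occurrences of the 983 frequent words / all words.
console.log((63428 / 78546).toFixed(3));
```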
Next, two-word phrases. Here is the top 10:
| Phrase | Count |
|---|---|
| of the | 284 |
| in the | 262 |
| on the | 207 |
| to the | 170 |
| out of | 142 |
| at the | 131 |
| he was | 125 |
| It was | 115 |
| to be | 109 |
| he said | 105 |
These... tell us nothing at all. Let's move on to four-word phrases. Here is the top 20:
| Phrase | Count |
|---|---|
| in front of the | 11 |
| out of the way | 11 |
| the rest of the | 11 |
| the end of the | 10 |
| how to get past | 9 |
| the back of the | 9 |
| the three of them | 9 |
| he was going to | 8 |
| for the first time | 7 |
| turned out to be | 7 |
| up and down the | 7 |
| as though he was | 6 |
| at the end of | 6 |
| he said in a | 6 |
| out of the window | 6 |
| seven hundred and thirteen | 6 |
| the back of his | 6 |
| up in the air | 6 |
| Harry shook his head | 5 |
| in front of him | 5 |
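Four-word phrases repeat far less often, but we can see that 11 times something is “in front of the” something... Note that “the three of them” appears 9 times, so the trio evidently acts together a lot in this book. Then there are 7 occurrences of “for the first time”; whose first times, I have no idea. “out of the window” shows up 6 times; it is a magical world, after all, so flying out of windows is perfectly normal. “seven hundred and thirteen” appears 6 times. Is the number 713 really that important? If you have read the book, tell me what it is: a password? A birthday? And our Harry “shook his head” a full 5 times, and 5 times there was something “in front of him”. Then again, that is only 5, far fewer than the eighty-one tribulations in Journey to the West.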
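The phrase tables in this post are n-gram counts: slide a window of n words over the text and count each window. Here is a minimal sketch under my own assumptions (punctuation is dropped, case is kept, and windows are allowed to run across sentence boundaries); the function names and the sample sentence are invented.

```javascript
// Illustrative n-gram counter; names and tokenization are assumptions.
function tokenizeKeepCase(text) {
  // Keep runs of letters (with internal apostrophes); drop punctuation.
  return text.match(/[A-Za-z]+(?:'[A-Za-z]+)*/g) || [];
}

function countNgrams(tokens, n) {
  var counts = new Map();
  // Each window starts at i and covers tokens i .. i + n - 1.
  for (var i = 0; i + n <= tokens.length; i++) {
    var gram = tokens.slice(i, i + n).join(" ");
    counts.set(gram, (counts.get(gram) || 0) + 1);
  }
  return counts;
}

// Invented sample text:
var words = tokenizeKeepCase("in front of the door, and in front of the wall");
var fourGrams = countNgrams(words, 4);
console.log(fourGrams.get("in front of the"));
console.log(fourGrams.size);
```

The same function covers every table here by changing `n` to 2, 4, 7, or 15. As `n` grows, nearly every window becomes unique, so the counts drop toward 1.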
On to seven-word phrases:
| Phrase | Count |
|---|---|
| for the first time in his life | 3 |
| a few words of what they were | 2 |
| and his work on alchemy with his | 2 |
| be mad ter try an rob it | 2 |
| blood and his work on alchemy with | 2 |
| caught a few words of what they | 2 |
| death is but the next great adventure | 2 |
| Dumbledore is particularly famous for his defeat | 2 |
| Dunno if he had enough human left | 2 |
| enough human left in him to die | 2 |
| famous for his defeat of the dark | 2 |
| few words of what they were saying | 2 |
| for his defeat of the dark wizard | 2 |
| for the discovery of the twelve uses | 2 |
| had enough human left in him to | 2 |
| have a clue what was going on | 2 |
| he caught a few words of what | 2 |
| he had enough human left in him | 2 |
| He was in a very good mood | 2 |
| his work on alchemy with his partner | 2 |
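Almost nothing repeats now. First place this time is “for the first time in his life”, with 3 occurrences. I will venture a bold guess that these three are a first meeting, a first kiss, and a first night. Of course, even I know that Harry's first kiss is not taken in the first book, so that is pure fantasy on my part. Fragments of “his work on alchemy with” show up quite often in the table above, so I can take a bold guess at the plot: someone does his work on alchemy with his partner to defeat the dark wizard, and our little Harry runs into unprecedented difficulties three times... That is as far as it goes, though. Even at this point I still have not grasped the book's central idea.
So I made one last attempt, with 15-word sequences: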
| Phrase | Count |
|---|---|
| was too high to make out and a magnificent marble staircase facing them led to | 1 |
| swished and flicked but the feather they were supposed to be sending skyward just lay | 1 |
| the edge of a huge chessboard behind the black chessmen which were all taller than | 1 |
| house wandering around and thinking about the end of the holidays where he could see | 1 |
| that were floating in midair over four long tables where the rest of the students | 1 |
| all about the four balls and the positions of the seven players describing famous games | 1 |
| When they say every flavor they mean every flavor you know you get all | 1 |
| I was down in the village havin a few drinks an got into a game | 1 |
| Dumbledore when we met him in the entrance hall he already knew he | 1 |
| it grew wider and wider a second later they were facing an archway | 1 |
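Nothing repeats anymore... and the fragments are too scattered to piece together a main idea, so tonight's attempt ends in failure. There is room for improvement, of course: I could drop words like “a” and “the” when searching for two-word phrases, or just pull the first sentence of each paragraph, or go read up on text mining to see what good algorithms exist. But it is too late tonight, so I will stop here...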
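As a rough idea of the first improvement, here is a sketch that skips any two-word phrase containing a common function word. The stop-word list is a tiny made-up sample, and skipping whole pairs (rather than deleting the words first, which would glue together words that were never adjacent) is my own reading of the idea.

```javascript
// Illustrative stop-word filter for two-word phrases.
// STOP_WORDS is a small hypothetical list, not a standard one.
var STOP_WORDS = new Set([
  "a", "an", "the", "of", "in", "on", "to", "at", "and",
  "was", "it", "he", "be", "out", "said"
]);

function contentBigrams(tokens) {
  var counts = new Map();
  for (var i = 0; i + 1 < tokens.length; i++) {
    var first = tokens[i];
    var second = tokens[i + 1];
    // Skip the pair if either word is a stop word.
    if (STOP_WORDS.has(first) || STOP_WORDS.has(second)) {
      continue;
    }
    var pair = first + " " + second;
    counts.set(pair, (counts.get(pair) || 0) + 1);
  }
  return counts;
}

// Invented sample text, already lowercased and split into words:
var sampleWords = "the dark wizard was in the castle and the dark wizard fled".split(" ");
console.log(Array.from(contentBigrams(sampleWords).entries()));
```

With a filter like this, pairs such as “of the” and “in the” drop out of the top 10, leaving room for pairs that carry some content.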
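P.S. I had originally written HTML for tables and charts, but Space said the post was too long and would not let me publish it, so I had to cut it down to plain text.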