
<378 words that are blocked on Weibo as of March 13, 2012>

I decided to run a re-test of my initial list of blocked words this morning. Below you’ll find 378* keywords that are blocked as of March 13, 2012. (Note: these are words blocked on Sina Weibo; this is not a list of words blocked by the Chinese government. Please read this Disinfo article before re-using content in this post.)

direct link

Of the 1300 mostly unique words I found to be unsearchable in my initial test in Nov/Dec 2011, 933 were subsequently unblocked some time between late January and early February 2012. But apparently that was an overreach, and as of this morning 393 of those 933 have been re-blocked (including 五毛 [Fifty Cent Party], 轮奸 [gang rape/gangbang], and 梯恩梯 [TNT], among others). I want to double-check and confirm that some of the longer words are indeed unique (that is, verify which root words cause them to be blocked), so in this list you’ll only find words of four characters or fewer (though I noticed after the fact that there are some non-unique words; for instance, there are a few containing 八八). I added in a few longer English words that I thought were of note, along with some others from another final Wikipedia list that I generated, giving us the above 378 words that are blocked as of this morning. Please note that these are terms that return an error message when you try to search for them on Weibo. As far as I know, you are free to post these words in a message. (Of course, there is the potential for censoring after the fact…**)
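The length filter and the uniqueness check described above can be sketched in a few lines of Python. This is only an illustration, not the script used for this post: the function name `split_unique` is hypothetical, and the sample list mixes terms mentioned in the post with two made-up placeholder strings (marked in the comments) so that both the length cut-off and the root-word check have something to catch.

```python
def split_unique(blocked_terms, max_len=4):
    """Keep terms of max_len characters or fewer, then separate terms that
    stand on their own from terms that contain a shorter blocked term."""
    # dict.fromkeys drops exact duplicates while keeping the original order.
    short = [t for t in dict.fromkeys(blocked_terms) if len(t) <= max_len]

    unique = []   # terms with no shorter blocked term inside them
    derived = {}  # term -> list of blocked "root" terms found inside it
    for term in short:
        roots = [r for r in short if r != term and r in term]
        if roots:
            derived[term] = roots
        else:
            unique.append(term)
    return unique, derived


sample = [
    "五毛", "轮奸", "梯恩梯", "八八",  # terms mentioned in the post
    "八八甲乙",    # made-up placeholder: four characters, contains 八八
    "一二三四五",  # made-up placeholder: five characters, dropped by the length filter
]

unique, derived = split_unique(sample)
print(unique)
print(derived)
```

A term that lands in `derived` is not necessarily blocked because of its root word; it is just a candidate to re-test by hand.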

For more about this project and how the Chinese government persuades Internet companies to self-censor, you can read my article up at Waging Nonviolence.

*Update: Forgot a few numbers like 64, 八八, and 1989. I’ve appended them to the bottom, but also removed a number of non-unique words I spotted after the fact (I left a few of the more interesting ones in) so this list now comprises 343 words.

**See this Carnegie Mellon study and browse WeiboScope by the Journalism and Media Studies Centre at the University of Hong Kong for more on Weibo posts deleted by censors.



<Coded and categorized: a sample of 219 blocked Weibo words>

Back in December, after I’d completed searching through half of my 700,000-word list, I decided to look more closely at what kinds of words were being blocked. I used the 218 two- and three-character words that I’d found to be blocked at the time (only two one-character words are blocked: 屄, cunt; and ҉, a Cyrillic character that is associated with backwards or bi-directional writing) as a sample and then proceeded to tag them according to whatever categories I began to see developing. (The categories are at the end of this post and on the second page of the spreadsheet as well.) I’ve since finished my search and have roughly 500 unique blocked words, which I hope to give the same treatment in the coming months.


direct link

As would be expected, the largest share of these keywords of three characters or fewer were names of people (most Chinese names are made up of a one-character surname and a one- or two-character given name). 87 of the 219 were names of people, and the majority of those people, 54, were CCP members. Nine of them were involved in corruption or some other controversy, after which they were usually dismissed. Fifteen of the people are dissidents of various sorts. Three are criminals who were neither dissidents nor CCP politicians and are probably listed because their crimes were so gruesome.
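To put those counts in proportion, here is a small Python sketch that turns them into percentages. It uses only the figures quoted above; the variable names are made up for illustration. The nine people tied to corruption or controversy are left out because the text does not say whether they overlap with the other groups, so the three groups listed here are not expected to add up to 87.

```python
TOTAL_SAMPLE = 219  # size of the coded sample
NAME_COUNT = 87     # keywords in the sample that are names of people

# Counts quoted in the paragraph above (group labels are shorthand).
name_groups = {
    "CCP members": 54,
    "dissidents": 15,
    "criminals": 3,
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


names_share = share(NAME_COUNT, TOTAL_SAMPLE)
group_shares = {label: share(n, NAME_COUNT) for label, n in name_groups.items()}

print(f"Names of people: {names_share}% of the sample")
for label, pct in group_shares.items():
    print(f"{label}: {pct}% of the names")
```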
